JOINT INVENTORS



"EXPRESS MAIL" mailing label No.

EM362726224US. 

Date of Deposit: July 9, 1999

I hereby certify that this paper (or fee) is being 

deposited with the United States Postal 

Service "EXPRESS MAIL POST OFFICE TO 

ADDRESSEE" service under 37 CFR §1.10 on 

the date indicated above and is addressed to: 

Assistant Commissioner for Patents, 

Washington, D.C. 20231



APPLICATION FOR 
UNITED STATES LETTERS PATENT 



SPECIFICATION 

TO ALL WHOM IT MAY CONCERN: 

Be it known that we, John Ford a citizen of United States,
residing at 2763 S. Norfolk, #210, in the City of San Mateo and State of
California, 94403 and Julio J. Mulero a citizen of United States, residing at
892 Southampton Drive, in the City of Palo Alto and State of California,
94303 have invented a new and useful METHODS AND MATERIALS
RELATING TO NOVEL CD39-LIKE POLYPEPTIDES, of which the following is
a specification.



METHODS AND MATERIALS RELATING TO 
NOVEL CD39-LIKE POLYPEPTIDES 



1. RELATED APPLICATIONS 

This patent application is a continuation-in-part of U.S. patent
application Serial No. 09/273,447 filed March 19, 1999, which is a
continuation-in-part of U.S. patent application Serial No. 09/122,449 filed
July 24, 1998 and also a continuation-in-part of U.S. patent application
Serial No. 09/244,444 [Attorney Docket No. 20411-745CON1] filed February 4,
1999, which in turn is a continuation-in-part of U.S. patent application
Serial No. 09/118,205 (Attorney Docket No. 20411-745), filed July 16, 1998,
the disclosures of all of which are incorporated by reference herein in
their entirety.

2. FIELD OF THE INVENTION 

This invention relates in general to novel polynucleotides isolated
from cDNA libraries of human fetal liver-spleen and macrophages and to
polypeptides encoded by these polynucleotides. In particular, the invention 
relates to a human CD39-like protein with homologies to ATP 
diphosphohydrolases and variants thereof. 


3. BACKGROUND 

CD39 (cluster of differentiation 39) is a cell-surface molecule
recognized by a "cluster" of monoclonal antibodies that can be used to identify
the lineage or stage of differentiation of lymphocytes and thus to distinguish one
class of lymphocytes from another. This CD39 molecule was originally defined
as a B lymphocyte marker (Rowe, M., et al., Int. J. Cancer 29:373 (1982)).
Subsequent studies have shown CD39 to be a marker for a distinct subset of
activated lymphocytes within the allosensitized CD8-positive cytotoxic cells
(Gouttefangeas, C., et al., Eur. J. Immunol. 22:2681 (1992)). Outside of lymphoid
tissue, CD39 can be found in quiescent vascular endothelial cells (Kansas, G.
S., et al., J. Immunol. 146:2235 (1991)) and throughout rat brain in the neurons
of the cerebral cortex, hippocampus, and cerebellum, as well as in glial cells
(Wang, T-F. and Guidotti, G., Brain Res. 790:318 (1998)).

CD39 is a 510-amino acid protein with a predicted molecular mass 
of 57 kDa. However, because of heavy glycosylation at asparagine residues (six 
potential N-glycosylation sites), the molecule displays a mobility closer to 100
kDa (Maliszewski, C. R., et al., J. Immunol. 153:3574 (1994)). CD39 contains 
two hydrophobic regions, one near the amino terminus and the other near the 
carboxyl terminus which are believed to be transmembrane regions. 

Reports that several ATP diphosphohydrolases (ATPDases) share
amino acid sequence homology with CD39 have been substantiated by showing
that CD39 is itself an ATPDase (Wang, T-F., et al., J. Biol. Chem. 271:9898
(1996); Kaczmarek, E., et al., J. Biol. Chem. 271:33116 (1996)). Since CD39 is
a plasma membrane-bound enzyme, CD39 has been termed an "ecto-ATPase,"
but CD39 is more often referred to as an "ecto-apyrase" because of the reduced
rate of hydrolysis of ADP when compared with ecto-ATPases.

This activity has been shown to modulate platelet reactivity and
aggregation in response to vascular injury. During vascular injury, activated
platelets aggregate, forming an occlusive thrombus. Excessive platelet
accumulation at sites of vascular injury can contribute to vessel occlusion.
Endothelial cells respond to the potentially occlusive effects of platelet
aggregation by several mechanisms. One of these mechanisms involves
ecto-apyrase-mediated removal of ADP, which in turn eliminates platelet
reactivity and recruitment. It is now known that the endothelial ecto-apyrase
responsible for this ADP removal is CD39 (Marcus, A. J., et al., J. Clin. Invest.
99:1351 (1997)).

Recently, CD39 was engineered to produce a soluble form of the
molecule. This soluble CD39 was shown to display the same nucleotidase
activity as the membrane-bound molecule (Gayle, R. B., et al., J. Clin. Invest.
101:1851 (1998)). Intravenously administered soluble CD39 also remained
active in mice for an extended period of time, indicating that soluble CD39 could
be useful as an inhibitor of platelet aggregation in the prophylaxis or treatment of
platelet-mediated thrombotic conditions.

Platelet aggregation inhibitors (antithrombotic agents) decrease
the formation or the action of chemical signals that promote platelet aggregation.
Currently available antithrombotic agents include aspirin, ticlopidine, and
dipyridamole. These agents have proven beneficial in the prevention and
treatment of occlusive cardiovascular diseases, including myocardial infarction,
cerebral ischemia, and angina. Antithrombotic therapy has also been used in the
maintenance of vascular grafts.

Myocardial infarction is the development of necrosis of the
myocardium (the middle muscular layer of the heart wall) due to a critical
imbalance between oxygen supply and myocardial demand. The most common cause
of acute myocardial infarction is narrowing of the epicardial blood vessels due
to atheromatous plaques. Plaque rupture with subsequent exposure of
basement membrane results in platelet aggregation and thrombus formation,
which can result in partial or complete occlusion of the vessel and subsequent
myocardial ischemia.

In cerebral ischemia, inadequate blood flow results from an
occlusion in a blood vessel or hemorrhaging. In the latter case, excessive
bleeding in one area of the brain deprives another area of blood. If the damage
occurs in a single small area, "transient" or "focused" cerebral ischemia
results. When a major artery (e.g., the carotid artery) is blocked, global or
diffuse ischemia results. The primary medical strategy for secondary prevention
of stroke is antiplatelet therapy. Aspirin is currently employed for reducing
the risk of recurrent transient ischemic attacks or stroke in men who have
transient ischemia of the brain due to fibrin emboli.

Each year, thousands of patients suffer a decline in blood flow to
one or more limbs. Unless sufficient blood flow can be restored in time, the
limb must be amputated. In some cases, grafts from the
patient's veins can be used to form new arteries. However, in cases where the
quality of the veins is poor, polymeric vascular grafts are typically used. The
polymeric grafts are inherently thrombogenic as the blood constituents passing
through the grafts become activated and tend to form clots. Efforts to line the
grafts with endothelial cells can reduce blood clotting, but better results are
obtained when antithrombotic therapy is employed.

Angina pectoris is a characteristic chest pain caused by 
inadequate blood flow through the blood vessels of the myocardium. The 
imbalance between oxygen delivery and utilization may result from a spasm of 
the vascular smooth muscle or from obstruction of blood vessels caused by
atherosclerotic lesions. Three classes of drugs have been shown to be effective
in treating angina: nitrates, beta-blockers and calcium channel blockers. 
Currently, the antithrombotics dipyridamole and aspirin are employed to 
prophylactically treat angina pectoris. 

Ecto-apyrases, such as CD39, offer a number of advantages over
several of the standard antithrombotics. For example, aspirin treatment controls
the prothrombotic action of thromboxane; however, aspirin also prevents
formation of antithrombotic prostacyclin, which limits aspirin's efficacy. Another
antithrombotic, endothelium-derived relaxing factor (nitric oxide; "EDRF/NO"),
is inhibited in vitro and in vivo by hemoglobin after its rapid diffusion into
erythrocytes. In contrast, CD39 is aspirin-insensitive and completely inhibits
platelet reactivity even when eicosanoid and EDRF/NO production are blocked.

CD39's ATPDase activity also implicates CD39 in the modulation
of neurotransmission. ATP is a major purinergic neurotransmitter that is often
co-released into the synaptic cleft with several neurotransmitters. Responses
to ATP are mediated by specific plasma membrane receptors, called P2
purinergic receptors (Dubyak, G. R. and El-Motassim, C., Am. J. Physiol.
34:C577-C606 (1993)). The distribution of CD39 in the rat brain indicates that
CD39 plays a role in terminating P2 purinergic neurotransmission (Wang, T. F.
and Guidotti, G., Brain Res. 790:318 (1998)). Furthermore, a decrease in
ecto-apyrase activity is believed to lead to an accumulation of the excitatory
neurotransmitter, extracellular ATP, as well as a deficiency of the endogenous
anticonvulsant extracellular adenosine.

The chromosomal localization of CD39 provides additional support
for a role in modulation of neurotransmission. More specifically, CD39 has been
mapped to chromosome 10q23.1-24.1 (Maliszewski, C. R., et al., J. Immunol.
153:3574 (1994)), and this site overlaps with the susceptibility locus for human
partial epilepsy with audiogenic symptoms (Ottman, R., et al., Nature Genet. 10:56
(1995)). This co-localization of the CD39 gene and the susceptibility locus has
led to the hypothesis that a decrease in ecto-apyrase activity in the brain is the
primary cause of partial epilepsy (Wang, T-F., et al., Mol. Brain Res. 47:295
(1997)).

A screen for human cDNAs that hybridize to cosmids from the
human chromosome 9q34 region led to the identification of a transcript with
high homology to a chicken muscle ecto-ATPase (60% identity) and the
ecto-apyrase CD39 (41% amino acid identity) (Chadwick, B. P., Mamm. Genome
8:668 (1997)). This gene, designated "CD39-like-1 gene" (CD39L1), has a
higher degree of homology to CD39 than does chicken muscle ecto-ATPase.
The biological activity of this protein has not been tested, but on the basis of the
high amino acid homology, CD39L1 is believed to be a new member of the
ecto-ATPase family. Recently, a mouse gene with homology to NTPases was
cloned and sequenced (Acc. No. AF006482) by Chadwick et al. (Mamm. Genome
9:162-164 (1998)).

4. SUMMARY OF THE INVENTION 

The invention is based on polynucleotides isolated from cDNA
libraries prepared from human fetal liver-spleen and macrophages. The
compositions of the present invention include novel isolated polypeptides with
apyrase and/or NDPase activity, in particular, novel human CD39-like
polypeptides, and active variants thereof. The compositions also include
isolated polynucleotides encoding such polypeptides, including recombinant DNA
molecules, cloned genes or degenerate variants thereof, especially naturally
occurring variants such as allelic variants, antisense polynucleotide molecules,
and antibodies that specifically recognize one or more epitopes present on such
polypeptides, as well as hybridomas producing such antibodies.
The compositions of the invention additionally include vectors,
including expression vectors, containing the polynucleotides of the invention,
cells genetically engineered to contain such polynucleotides and cells 
genetically engineered to express such polynucleotides. 

The polynucleotides of the invention include naturally occurring or
wholly or partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g.,
mRNA. The isolated polynucleotides of the invention include a polynucleotide
encoding a polypeptide comprising the amino acid sequence of SEQ ID NO. 3.
The isolated polynucleotides of the invention further include a polynucleotide
comprising the nucleotide sequence of SEQ ID NO. 2. The polynucleotides of
the invention also include polynucleotides that encode polypeptides with a
biological activity of the polypeptide of SEQ ID NO. 3 (including apyrase or
NTPase activity) and that hybridize under stringent hybridization conditions to
the complement of (a) the nucleotide sequence of SEQ ID NO. 2, or (b) a
nucleotide sequence encoding the amino acid sequence of SEQ ID NO. 3; a
polynucleotide which is an allelic variant of any polynucleotide recited above; a
polynucleotide which has at least 80% sequence identity to a polynucleotide of
SEQ ID NO. 2; or a polynucleotide that encodes a polypeptide comprising at
least one CD39-like domain, e.g., a catalytic domain.

The polynucleotides of the invention additionally include the complement
of any of the polynucleotides recited above.

One polynucleotide according to the invention encodes a novel
CD39-like protein having the amino acid sequence shown in Figure 2 (SEQ ID
NO. 3), which has been designated CD39-L66 and is an isoform of CD39-L4.
The invention also provides a polynucleotide including a nucleotide sequence
that is substantially equivalent to this polynucleotide. Polynucleotides according
to the invention can have at least about 80%, more typically at least about 90%,
and even more typically at least about 95%, sequence identity to a
polynucleotide encoding a polypeptide including SEQ ID NO. 3.
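
To illustrate what a threshold such as "at least about 80% sequence identity" means in practice, the following minimal Python sketch computes percent identity between two sequences that have already been aligned. The function name, the use of '-' as the gap symbol, the choice to skip gapped columns, and the toy sequences are all assumptions made for illustration; the specification does not prescribe a particular alignment program or identity formula, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns that hold identical residues.

    Hypothetical helper: both inputs must already be aligned (equal length,
    '-' marks a gap). Columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped column: not counted in this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy sequences for illustration only (not taken from any SEQ ID NO).
identity = percent_identity("ATGGCCAAGT", "ATGGCTAAGT")
print(f"{identity:.1f}% identity")  # one mismatch in ten columns
```

Under this convention the toy pair would satisfy both the 80% and the 90% thresholds but not the 95% threshold.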

A further aspect of the invention is the development of novel
CD39-L4 variants which have improved ADPase activity compared to wild type
CD39-L4 (SEQ ID NO: 5). A preferred variant, designated ACRIII herein, has the
amino acid sequence set forth in SEQ ID NO: 7. The invention further provides
polypeptides comprising at least one amino acid substitution selected from the
group consisting of: D168-T, S170-Q and L175-F, wherein said substitution(s)
result in increased ADPase activity of the polypeptide. One preferred
embodiment is the polypeptide having the sequence set forth in SEQ ID NO: 7,
which is a variant CD39-L4 containing all three substitutions that has been
designated ACRIII. Alternatively, instead of making the specific D168-T,
S170-Q and/or L175-F substitution(s), substitution of amino acids with similar
properties is contemplated. Additional conservative substitutions at amino acid
positions other than D168, S170 and/or L175 are further contemplated. For
example, all of the corresponding amino acids from CD39 could be substituted
for amino acids 167-181 of CD39-L66 or CD39-L4.
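
The substitution notation used above (wild-type residue, 1-based position, replacement residue) can be applied mechanically to a sequence. The sketch below is an illustration only: the helper name `apply_substitutions` is hypothetical, and the input is a placeholder string built so that positions 168, 170 and 175 carry D, S and L. It is not the CD39-L4 sequence of SEQ ID NO: 5, which is not reproduced here.

```python
import re


def apply_substitutions(sequence: str, substitutions: list[str]) -> str:
    """Apply point substitutions written as '<wild-type><position>-<new>'.

    Positions are 1-based. Each substitution is checked against the residue
    actually present, so a numbering mismatch raises an error instead of
    silently producing the wrong variant.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)-([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild_type, position, replacement = match[1], int(match[2]), match[3]
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Placeholder sequence: 'A' everywhere except the three positions of interest.
toy = ["A"] * 180
toy[167], toy[169], toy[174] = "D", "S", "L"  # 0-based indices for 168, 170, 175
variant = apply_substitutions("".join(toy), ["D168-T", "S170-Q", "L175-F"])
print(variant[167], variant[169], variant[174])  # T Q F
```

Checking the wild-type residue before replacing it is a design choice that guards against off-by-one numbering, for example when a signal peptide is or is not counted.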

Polynucleotides encoding these variants, vectors and host cells
comprising such polynucleotides, methods of using such host cells to produce
polypeptides, and other therapeutic products comprising the polypeptides
(including fusion proteins in which the ACRIII is fused to a heterologous peptide
or polypeptide, such as an immunoglobulin constant region, or derivatives in
which ACRIII is modified by water soluble polymers to increase its half-life) are
also comprehended by the invention, as are methods of treating a subject
suffering from a disorder relating to thrombosis, coagulation or platelet
aggregation by administering such therapeutic products.

Gene therapy techniques are also provided to modulate disease
states associated with CD39-L4 expression and/or biological activity. Delivery
of a functional CD39-L4 gene to appropriate cells is effected ex vivo, in situ, or
in vivo by use of vectors, and more particularly viral vectors (e.g., adenovirus,
adeno-associated virus, or a retrovirus), or ex vivo by use of physical DNA
transfer methods (e.g., liposomes or chemical treatments).

The invention also relates to methods for producing polypeptides
of the invention comprising growing a culture of cells of the invention in a
suitable culture medium under conditions permitting expression of the desired
polypeptide, and purifying the protein from the cells or the culture medium.
Preferred embodiments include those in which the protein produced by such
process is a mature form of the protein.

Protein compositions of the present invention, including therapeutic
compositions, comprise polypeptides of the invention and optionally an
acceptable carrier, such as a hydrophilic (e.g., pharmaceutically acceptable)
carrier.

Polynucleotides according to the invention have numerous
applications in a variety of techniques known to those skilled in the art of
molecular biology. These techniques include use as hybridization probes, use
as oligomers for PCR, use for chromosome and gene mapping, use in the
recombinant production of protein, and use in generation of anti-sense DNA or
RNA, their chemical analogs and the like. For example, because the expression
of CD39-like mRNA is largely restricted to macrophages, polynucleotides of the
invention can be used as hybridization probes to detect the presence of
macrophage mRNA in a sample using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in
diagnostics as expressed sequence tags for identifying expressed genes or, as
well known in the art and exemplified by Vollrath et al., Science 258:52-59
(1992), as expressed sequence tags for physical mapping of the human
genome.

A polynucleotide according to the invention can be joined to any
of a variety of other nucleotide sequences by well-established recombinant DNA
techniques (see Sambrook J et al. (1989) Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory, NY). Useful nucleotide sequences for
joining to polynucleotides include an assortment of vectors, e.g., plasmids,
cosmids, lambda phage derivatives, phagemids, and the like, that are well
known in the art. Accordingly, the invention also provides a vector including a
polynucleotide of the invention and a host cell containing the polynucleotide. In
general, the vector contains an origin of replication functional in at least one
organism, convenient restriction endonuclease sites, and a selectable marker
for the host cell. Vectors according to the invention include expression vectors,
replication vectors, probe generation vectors, and sequencing vectors. A host
cell according to the invention can be a prokaryotic or eukaryotic cell and can
be a unicellular organism or part of a multicellular organism.

The polypeptides according to the invention can be used in a
variety of conventional procedures and methods that are currently applied to
other proteins. For example, a polypeptide of the invention can be used to
generate an antibody which specifically binds the polypeptide. The polypeptides
of the invention having ATPDase activity are also useful for inhibiting platelet
aggregation and can therefore be employed in the prophylaxis or treatment of
pathological conditions caused by the inflammatory response. The polypeptides
of the invention can also be used as molecular weight markers, and as a food
supplement.

Another aspect of the invention is an antibody that specifically
binds the polypeptide of the invention. Such antibodies can be either
monoclonal or polyclonal antibodies, as well as fragments thereof and humanized
forms or fully human forms, such as those produced in transgenic animals. The
invention further provides a hybridoma that produces an antibody according to
the invention.

Antibodies of the invention are useful for detection and/or 
purification of the polypeptides of the invention. 

Methods are also provided for preventing, treating or ameliorating
a medical condition, including thrombotic diseases, which comprises
administering to a mammalian subject, including but not limited to humans, a
therapeutically effective amount of a composition comprising a polypeptide of
the invention or a therapeutically effective amount of a composition comprising
a binding partner of (e.g., antibody specifically reactive for) CD39-like
polypeptides of the invention. The mechanics of the particular condition or
pathology will dictate whether the polypeptides of the invention or binding
partners (or inhibitors) of these would be beneficial to the individual in need of
treatment.

The invention also provides a method of inhibiting platelet function
comprising administering a CD39-L4 polypeptide of the invention to a medium
comprising platelets. According to this method, polypeptides of the invention
can be administered to produce an in vitro or in vivo inhibition of platelet
function. A polypeptide of the invention can be administered in vivo as an
antithrombotic agent alone or as an adjunct to other therapies.

The invention also provides methods for detecting or quantitating
the presence of the polynucleotides or polypeptides of the invention in a tissue
or fluid sample, and corresponding kits that comprise suitable polynucleotide
probes or antibodies, together with an optional quantitative standard. Such
methods and kits can be utilized as part of prognostic and diagnostic evaluation
of patients and for the identification of subjects exhibiting a predisposition to
platelet mediated conditions.

The invention also provides methods for the identification of
compounds that modulate (i.e., increase or decrease) the expression or activity
of the polynucleotides and/or polypeptides of the invention. Such methods can
be utilized, for example, for the identification of compounds and other
substances that interact with (e.g., bind to) the polypeptides of the invention, and
assays for identifying compounds and other substances that enhance or inhibit
the activity of the polypeptides of the invention, such assays comprising the step
of measuring activity of such polypeptides in the presence and absence of the
test compound.

5. BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows polynucleotide sequences according to the invention.
SEQ ID NO:1 was obtained from the b2HFLS20W cDNA library using standard
PCR, sequencing by hybridization signature analysis, and single pass gel
sequencing technology. A- adenosine; C- cytosine; G- guanosine; T- thymine.
Ambiguous positions are designated as follows: R indicates A or G; M indicates
A or C; W indicates A or T; Y indicates C or T; S indicates C or G; K indicates
G or T; V indicates A or C or G; H indicates A or C or T; D indicates A or G or
T; B indicates C or G or T; and N indicates any of the four bases.
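
The ambiguity designations listed for FIG. 1 follow the standard IUPAC nucleotide code (with S as the letter for C or G). As a minimal sketch, they can be held in a lookup table and used to enumerate every unambiguous sequence that an ambiguous read is compatible with. The names `IUPAC_CODES` and `expand_ambiguous` and the three-base example are hypothetical and are not part of the specification.

```python
from itertools import product

# Ambiguity codes as listed for FIG. 1 (standard IUPAC nucleotide code).
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "M": "AC", "W": "AT", "Y": "CT", "S": "CG", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "N": "ACGT",
}


def expand_ambiguous(sequence: str) -> list[str]:
    """List every unambiguous sequence consistent with an ambiguous one."""
    try:
        pools = [IUPAC_CODES[base] for base in sequence.upper()]
    except KeyError as err:
        raise ValueError(f"not a recognized nucleotide code: {err}") from None
    return ["".join(combo) for combo in product(*pools)]


# Toy example: R (A or G) and Y (C or T) give 1 x 2 x 2 = 4 possibilities.
print(expand_ambiguous("ARY"))  # ['AAC', 'AAT', 'AGC', 'AGT']
```

The number of expansions grows multiplicatively with each ambiguous position, so full enumeration is only practical for short stretches such as probe or primer regions.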

SEQ ID NO:2 is an extended version of SEQ ID NO:1 which was
obtained as described in Example 34.

FIG. 2 shows an amino acid sequence corresponding to the
polynucleotide sequence of SEQ ID NO:2. This sequence is designated as SEQ
ID NO:3. The open reading frame encoding SEQ ID NO:3 begins at nucleotide
246 (numbered from the 5' end) of SEQ ID NO:2. A- Alanine; R- Arginine; N-
Asparagine; D- Aspartic Acid; C- Cysteine; E- Glutamic Acid; Q- Glutamine; G-
Glycine; H- Histidine; I- Isoleucine; L- Leucine; K- Lysine; M- Methionine; F-
Phenylalanine; P- Proline; S- Serine; T- Threonine; W- Tryptophan; Y- Tyrosine;
V- Valine; X- any of the twenty amino acids.
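
The relationship between a nucleotide sequence, a 1-based start position, and the single-letter amino acid sequence can be sketched as follows. This is an illustration under stated assumptions: it uses the standard genetic code, reports X for any codon containing an ambiguous base, and runs on a short toy sequence rather than SEQ ID NO:2 (for which the start argument would be 246). The function name `translate_from` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from(sequence: str, start: int) -> str:
    """Translate from a 1-based start position up to the first stop codon.

    Codons containing an ambiguous base are reported as 'X'.
    """
    if start < 1:
        raise ValueError("start is 1-based and must be at least 1")
    sequence = sequence.upper()
    protein = []
    for i in range(start - 1, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE.get(sequence[i:i + 3], "X")
        if amino_acid == "*":  # termination codon ends the reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy sequence: two leading bases, then ATG GCC NNN AAG TAA.
toy = "GG" + "ATGGCCNNNAAGTAA"
print(translate_from(toy, 3))  # MAXK
```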

FIG. 3 shows the amino acid sequence alignment of SEQ ID NO:3
(identified as "246 prot") and human CD39 ("CD39Human.seq"). The amino acid
residues are designated as for FIG. 2. The alignment was generated using the
J. Hein method with the PAM250 residue weight table. Gaps are indicated by
dashes; residues that are identical between the two sequences (within 1
distance unit) are boxed.

FIG. 4 shows the amino acid sequence alignment of SEQ ID NO:3
(identified as "264 prot") and murine NTPase ("mur ntpase"). The amino acid
residues are designated as for FIG. 2. The alignment was generated as
discussed for FIG. 3. Gaps are indicated by dashes; residues that are identical
between the two sequences (within 1 distance unit) are boxed.

FIG. 5 shows the apyrase conserved regions (ACR) in CD39-L4
in bold. ACR I starts at Phe 53, ACR II starts at Pro 124 and ACR III starts at
Met 167. The boxed sections highlight the amino acid substitutions that were
made in the wild type CD39-L4 amino acid sequence to form mutants designated
ACRI, ACRII and ACRIII.

FIG. 6 (SEQ ID NOS: 6 and 7) shows the nucleotide and
corresponding amino acid sequences of a preferred ACRIII mutant containing
the following substitutions in the wild type CD39-L4 amino acid sequence:
D168-T, S170-Q and L175-F.

FIG. 7 shows the ADPase activity of CD39-L4 variants ACRI,
ACRII and ACRIII in comparison to wild type CD39-L4: (1) CD39-L4 ACR I
mutant; (2) CD39-L4 ACR II mutant; (3) CD39-L4 ACR III mutant; (4) CD39-L4
wild type; (5) sCD39; and (6) pSecTag2 vector (Invitrogen).


6. DETAILED DESCRIPTION 
6.1 Definitions 

The term "nucleotide sequence" refers to a heteropolymer of
nucleotides or the sequence of these nucleotides. The terms "nucleic acid"
and "polynucleotide" are also used interchangeably herein to refer to a
heteropolymer of nucleotides. Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short
oligonucleotide linkers, or from a series of oligonucleotides, to provide a
synthetic nucleic acid which is capable of being expressed in a recombinant
transcriptional unit comprising regulatory elements derived from a microbial or
viral operon.

An "oligonucleotide fragment" or a "polynucleotide fragment",
"portion," or "segment" is a stretch of nucleotide residues which is
long enough to use in polymerase chain reaction (PCR) or various hybridization
procedures to identify or amplify identical or related parts of mRNA or DNA
molecules.

"Oligonucleotides" or "nucleic acid probes" are prepared based on 
the cDNA sequence provided in the present invention. Oligonucleotides 
comprise portions of the DNA sequence having at least about 15 nucleotides 
and usually at least about 20 nucleotides. Nucleic acid probes comprise 
portions of the sequence having fewer nucleotides than about 6 kb, usually 
fewer than about 1 kb. After appropriate testing to eliminate false positives, 
these probes may be used to determine whether mRNAs are present in a cell or 
tissue or to isolate similar nucleic acid sequences from chromosomal DNA as 
described by Walsh PS et al (1992 PCR Methods Appl 1:241-250).

The term "probes" includes naturally occurring or recombinant 
single- or double-stranded nucleic acids or chemically synthesized nucleic acids. 
They may be labeled by nick translation, Klenow fill-in reaction, PCR or other
methods well known in the art. Probes of the present invention, their preparation
and/or labeling are elaborated in Sambrook J et al (1989) Molecular Cloning:
A Laboratory Manual, Cold Spring Harbor Laboratory, NY; or Ausubel FM et al 
(1989) Current Protocols in Molecular Biology, John Wiley & Sons, New York 
NY, both incorporated herein by reference. 

The term "stringent" is used to refer to conditions that are
commonly understood in the art as stringent. An exemplary set of conditions
includes a temperature of 60-70°C (preferably about 65°C) and a salt
concentration of 0.70 M to 0.80 M (preferably about 0.75 M). Further exemplary
conditions include hybridizing conditions that (1) employ low ionic strength and
high temperature for washing, for example, 0.015 M NaCl/0.0015 M sodium
citrate/0.1% SDS at 50°C; (2) employ during hybridization a denaturing agent
such as formamide, for example, 50% (vol/vol) formamide with 0.1% bovine
serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate
buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42°C; or (3) employ
50% formamide, 5 x SSC (0.75 M NaCl, 0.075 M sodium pyrophosphate), 5 x
Denhardt's solution, sonicated salmon sperm DNA (50 µg/ml), 0.1% SDS, and
10% dextran sulfate at 42°C, with washes at 42°C in 0.2 x SSC and 0.1% SDS.
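
The salt concentrations quoted in these exemplary conditions can be related to one another through the "x SSC" shorthand. The sketch below assumes the conventional laboratory definition of 1 x SSC as 0.15 M NaCl with 0.015 M sodium citrate; that definition is general laboratory convention rather than something stated in this specification, and the names `SSC_1X` and `ssc_molarity` are hypothetical.

```python
# Assumed convention: 1 x SSC = 0.15 M NaCl + 0.015 M sodium citrate.
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold: float) -> dict[str, float]:
    """Molar concentration of each SSC component at the given fold strength."""
    if fold < 0:
        raise ValueError("fold strength cannot be negative")
    return {salt: round(molar * fold, 6) for salt, molar in SSC_1X.items()}


# Strengths that correspond to conditions quoted in the text.
for fold in (5, 0.2, 0.1):
    print(f"{fold} x SSC:", ssc_molarity(fold))
```

Under that assumption, the 0.015 M NaCl/0.0015 M sodium citrate wash in condition (1) corresponds to 0.1 x SSC, and the 750 mM NaCl with 75 mM sodium citrate in condition (2) corresponds to 5 x SSC.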

The term "recombinant," as used herein, means that a polypeptide
or protein is derived from recombinant (e.g., microbial or mammalian) expression
systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant
microbial" defines a polypeptide or protein essentially free of native endogenous
substances and unaccompanied by associated native glycosylation.
Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will
be free of glycosylation modifications; polypeptides or proteins expressed in
yeast will have a glycosylation pattern different from that expressed in
mammalian cells.

The term "recombinant expression vehicle or vector" refers to a
plasmid or phage or virus or vector, for expressing a polypeptide from a DNA
(RNA) sequence. The expression vehicle can comprise a transcriptional unit
comprising an assembly of (1) a genetic element or elements having a
regulatory role in gene expression, for example, promoters or enhancers, (2) a
structural or coding sequence which is transcribed into mRNA and translated
into protein, and (3) appropriate transcription initiation and termination
sequences. Structural units intended for use in yeast or eukaryotic expression
systems preferably include a leader sequence enabling extracellular secretion
of translated protein by a host cell. Alternatively, where recombinant protein is
expressed without a leader or transport sequence, it may include an N-terminal
methionine residue. This residue may or may not be subsequently cleaved from
the expressed recombinant protein to provide a final product.

"Recombinant expression system" means host cells which have 
stably integrated a recombinant transcriptional unit into chromosomal DNA or 
carry the recombinant transcriptional unit extrachromosomally. The cells can be 
prokaryotic or eukaryotic. Recombinant expression systems as defined herein
will express heterologous polypeptides or proteins upon induction of the
regulatory elements linked to the DNA segment or synthetic gene to be
expressed.

The term "open reading frame," ORF, means a series of triplets
coding for amino acids without any termination codons and is a sequence
translatable into protein.
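
The definition above (a series of triplets without any termination codons) can be turned into a simple scan. The following minimal sketch searches the three forward reading frames of a sequence for the longest run of complete triplets that contains none of the standard stop codons TAA, TAG or TGA. It deliberately follows only the definition given here: it does not require a start codon and does not examine the reverse strand. The function name and the toy input are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_open_frame(sequence: str) -> tuple[int, str]:
    """Return (1-based start, subsequence) of the longest stop-free triplet run.

    All three forward frames are scanned; ties keep the first run found.
    """
    sequence = sequence.upper()
    best_start, best = 0, ""
    for frame in range(3):
        run_start = frame
        for i in range(frame, len(sequence) - 2, 3):
            if sequence[i:i + 3] in STOP_CODONS:
                if i - run_start > len(best):
                    best_start, best = run_start + 1, sequence[run_start:i]
                run_start = i + 3  # next run begins after the stop codon
        # Close the final run at the last complete triplet in this frame.
        end = frame + ((len(sequence) - frame) // 3) * 3
        if end - run_start > len(best):
            best_start, best = run_start + 1, sequence[run_start:end]
    return best_start, best


# Toy input: ATG AAA TAG in frame 1; the second frame contains TGA.
print(longest_open_frame("ATGAAATAG"))  # (1, 'ATGAAA')
```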

The term "expression modulating fragment," EMF, means a series 
of nucleotide molecules which modulates the expression of an operably linked 
ORF or EMF. 

As used herein, a sequence is said to "modulate the expression of an operably
linked sequence" when the expression of the sequence is altered by the
presence of the EMF. EMFs include, but are not limited to, promoters, and
promoter modulating sequences (inducible elements). One class of EMFs are
fragments which induce the expression of an operably linked ORF in response
to a specific regulatory factor or physiological event.

As used herein, an "uptake modulating fragment," UMF, means a
series of nucleotide molecules which mediate the uptake of a linked DNA
fragment into a cell. UMFs can be readily identified using known UMFs as a 
target sequence or target motif with the computer-based systems described 
above. 

The presence and activity of a UMF can be confirmed by attaching
the suspected UMF to a marker sequence. The resulting nucleic acid molecule
is then incubated with an appropriate host under appropriate conditions and the 
uptake of the marker sequence is determined. As described above, a UMF will 
increase the frequency of uptake of a linked marker sequence. 

"Active" refers to those forms of the polypeptide which retain the
biologic and/or immunologic activities of any naturally occurring polypeptide.

"Naturally occurring polypeptide" refers to polypeptides produced 
by cells that have not been genetically engineered and specifically contemplates 
various polypeptides arising from post-translational modifications of the
polypeptide including, but not limited to, acetylation, carboxylation, glycosylation,
phosphorylation, lipidation and acylation. 

"Derivative" refers to polypeptides chemically modified by such 
techniques as ubiquitination, labeling (e.g., with radionuclides or various
enzymes), pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do 
not normally occur in human proteins. 

"Recombinant variant" refers to any polypeptide differing from 
naturally occurring polypeptides by amino acid insertions, deletions, and
substitutions, created using recombinant DNA techniques. Guidance in
determining which amino acid residues may be replaced, added or deleted 
without abolishing activities of interest, such as cellular trafficking, may be found 
by comparing the sequence of the particular polypeptide with that of homologous 
peptides and minimizing the number of amino acid sequence changes made in
regions of high homology.

Preferably, amino acid "substitutions" are the result of replacing 
one amino acid with another amino acid having similar structural and/or 
chemical properties, such as the replacement of a leucine with an isoleucine or 
valine, an aspartate with a glutamate, or a threonine with a serine, i.e.,
conservative amino acid replacements. "Insertions" or "deletions" are typically
in the range of about 1 to 5 amino acids. The variation allowed may be 
experimentally determined by systematically making insertions, deletions, or 
substitutions of amino acids in a polypeptide molecule using recombinant DNA 
techniques and assaying the resulting recombinant variants for activity. 
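
The notion of a conservative replacement can be sketched in code. The following Python example classifies each difference between an aligned reference and variant sequence as conservative or not. The residue groupings in `GROUPS` are an assumption chosen for illustration; they cover the leucine/isoleucine/valine, aspartate/glutamate, and threonine/serine examples given above but are not a definition taken from this specification.

```python
# Hypothetical grouping of residues (one-letter codes) by side-chain character.
GROUPS = [
    set("LIVMA"),  # aliphatic / hydrophobic
    set("DE"),     # acidic
    set("ST"),     # hydroxyl-containing
    set("KRH"),    # basic
    set("NQ"),     # amide
    set("FWY"),    # aromatic
]


def is_conservative(original, replacement):
    """True if both residues differ but fall in the same illustrative group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in g and replacement in g for g in GROUPS)


def classify_substitutions(reference, variant):
    """List (position, reference residue, variant residue, conservative?).

    Sequences must be the same length (substitutions only); positions are
    1-based.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference, variant))
        if r != v
    ]
```

In practice, whether a given replacement preserves activity would still be determined experimentally, as described above.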

As used herein, "substantially equivalent" can refer both to
nucleotide and amino acid sequences, for example a mutant sequence, that
varies from a reference sequence by one or more substitutions, deletions, or 
additions, the net effect of which does not result in an adverse functional 
dissimilarity between the reference and subject sequences. Typically, such a
mutant sequence varies from one of those listed herein by no more than about
20% (i.e., the number of substitutions, additions, and/or deletions in a mutant
sequence, as compared to the corresponding listed sequence, divided by the
total number of residues in the mutant sequence is about 0.2 or less). Such a
mutant sequence is said to have 80% sequence identity to the listed sequence.
In one embodiment, a mutant sequence of the invention varies from a listed
sequence by no more than 10% (90% sequence identity), in a variation of this
embodiment, by no more than 5% (95% sequence identity), and in a further
variation of this embodiment, by no more than 2% (98% sequence identity).
Mutant amino acid sequences according to the invention generally have at least
95% sequence identity with a listed amino acid sequence, whereas mutant
nucleotide sequences of the invention can have lower percent sequence
identities. For the purposes of the present invention, sequences having
substantially equivalent biological activity and substantially equivalent
expression characteristics are considered substantially equivalent. For the
purposes of determining equivalence, truncation of the mature sequence should
be disregarded.
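
The ratio described in this definition can be computed as follows. This Python sketch assumes the listed and mutant sequences have already been aligned to equal length with a gap character marking additions and deletions; the function names, the gap symbol `-`, and the default threshold of 0.2 are assumptions for illustration, and the alignment step itself is not shown.

```python
def fraction_changed(listed_aligned, mutant_aligned, gap="-"):
    """Changes (substitutions, additions, deletions) divided by the number
    of residues in the mutant sequence.

    Both inputs must be pre-aligned strings of equal length.
    """
    if len(listed_aligned) != len(mutant_aligned):
        raise ValueError("aligned sequences must have equal length")
    # Every column where the two sequences differ counts as one change.
    changes = sum(1 for a, b in zip(listed_aligned, mutant_aligned) if a != b)
    # Residues actually present in the mutant (gap columns excluded).
    mutant_len = sum(1 for b in mutant_aligned if b != gap)
    if mutant_len == 0:
        raise ValueError("mutant sequence has no residues")
    return changes / mutant_len


def within_threshold(listed_aligned, mutant_aligned, threshold=0.2):
    """True if the mutant varies from the listed sequence by no more than
    `threshold` (0.2 corresponds to the 'about 20%' figure above)."""
    return fraction_changed(listed_aligned, mutant_aligned) <= threshold
```

For example, one substitution in a ten-residue mutant gives a ratio of 0.1, which corresponds to 90% sequence identity under the definition above. Functional equivalence would still need to be assessed separately.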

Where desired, an expression vector may be designed to contain 
a "signal or leader sequence" which will direct the polypeptide through the 
membrane of a cell. Such a sequence may be naturally present on the
polypeptides of the present invention or provided from heterologous protein
sources by recombinant DNA techniques. 

A polypeptide "fragment," "portion," or "segment" is a stretch of
amino acid residues of at least about 5 amino acids, often at least about 7 amino
acids, typically at least about 9 to 13 amino acids, and, in various embodiments,
at least about 17 or more amino acids. To be active, any polypeptide must have
sufficient length to display biologic and/or immunologic activity. 

Alternatively, recombinant variants encoding these same or similar 
polypeptides may be synthesized or selected by making use of the "redundancy" 
in the genetic code. Various codon substitutions, such as the silent changes
which produce various restriction sites, may be introduced to optimize cloning
into a plasmid or viral vector or expression in a particular prokaryotic or
eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify
the properties of any part of the polypeptide, to change characteristics such as
ligand-binding affinities, interchain affinities, or degradation/turnover rate.

"Activated" cells as used in this application are those which are 
engaged in extracellular or intracellular membrane trafficking, including the 
export of neurosecretory or enzymatic molecules as part of a normal or disease 
process. 

The term "purified" as used herein denotes that the indicated
nucleic acid or polypeptide is present in the substantial absence of other
biological macromolecules, e.g., polynucleotides, proteins, and the like. In one 
embodiment, the polynucleotide or polypeptide is purified such that it constitutes 
at least 95% by weight, more preferably at least 99.8% by weight, of the
indicated biological macromolecules present (but water, buffers, and other small
molecules, especially molecules having a molecular weight of less than 1000 
daltons, can be present). 

The term "isolated" as used herein refers to a nucleic acid or 
polypeptide separated from at least one other component (e.g., nucleic acid or
polypeptide) present with the nucleic acid or polypeptide in its natural source.
In one embodiment, the nucleic acid or polypeptide is found in the presence of 
(if anything) only a solvent, buffer, ion, or other component normally present in 
a solution of the same. The terms "isolated" and "purified" do not encompass 
nucleic acids or polypeptides present in their natural source. 

The term "infection" refers to the introduction of nucleic acids into
a suitable host cell by use of a virus or viral vector.

The term "transformation" means introducing DNA into a suitable 
host cell so that the DNA is replicable, either as an extrachromosomal element, 
or by chromosomal integration.

The term "transfection" refers to the taking up of an expression
vector by a suitable host cell, whether or not any coding sequences are in fact 
expressed. 

The term "intermediate fragment" means a nucleic acid between
5 and 1000 bases in length, and preferably between 10 and 40 bp in length.

Each of the above terms is meant to encompass all that is
described for each, unless the context dictates otherwise. 

6.4 Hybridization Conditions 

Suitable hybridization conditions may be routinely determined by
optimization procedures or pilot studies. Such procedures and studies are
routinely conducted by those skilled in the art to establish protocols for use in
a laboratory. See, e.g., Ausubel et al., Current Protocols in Molecular Biology,
Vol. 1-2, John Wiley & Sons (1989); Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd Ed., Vols. 1-3, Cold Spring Harbor Press (1989); and
Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York (1982), all of which are incorporated
by reference herein. For example, conditions such as temperature,
concentration of components, hybridization and washing times, buffer
components, and their pH and ionic strength may be varied.

6.7 Nucleic Acids of the Invention 

The sequences falling within the scope of the present invention are 
not limited to the specific sequences herein described, but also include allelic
variations thereof. Allelic variations can be routinely determined by comparing
the sequence provided in SEQ ID NOs:1-2, a representative fragment thereof,
or a nucleotide sequence at least 99.9% identical to SEQ ID NOs:1-2 with a
sequence from another isolate of the same species. Furthermore, to
accommodate codon variability, the invention includes nucleic acid molecules
coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one
codon for another which encodes the same amino acid is expressly 
contemplated. 
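
Codon substitution of this kind can be checked computationally. The sketch below builds the standard genetic code table and tests whether two coding sequences encode the same amino acid sequence. The function names are hypothetical, and use of the standard (nuclear) genetic code is an assumption of the example.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence codon by codon ('*' marks termination)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_polypeptide(dna_a, dna_b):
    """True if two coding sequences translate to identical polypeptides."""
    return translate(dna_a) == translate(dna_b)
```

For instance, `ATGCTGGAT` and `ATGTTAGAC` differ at three nucleotide positions yet both translate to Met-Leu-Asp.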

Any specific sequence disclosed herein can be readily screened 
for errors by resequencing a particular fragment, such as an ORF, in both
directions (i.e., sequence both strands). 
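
A minimal sketch of such a two-strand check follows. It assumes two reads covering exactly the same fragment, one from each strand, each written 5' to 3'; the function names and the equal-length requirement are assumptions made for the example, and real reads would first need trimming and alignment.

```python
# Complement map for unambiguous bases plus N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def strand_discrepancies(forward_read, reverse_read):
    """Compare a forward-strand read with the reverse complement of the
    opposite-strand read; return (1-based position, forward base, other base)
    for every disagreement."""
    forward = forward_read.upper()
    other = reverse_complement(reverse_read)
    if len(forward) != len(other):
        raise ValueError("reads must cover the same fragment")
    return [
        (i + 1, f, r)
        for i, (f, r) in enumerate(zip(forward, other))
        if f != r
    ]
```

An empty result indicates that the two strands agree; any reported position is a candidate sequencing error to be resolved by further resequencing.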

The present invention further provides recombinant constructs 
comprising a nucleic acid having the sequence of any one of SEQ ID NOs: 1-2 
or a fragment thereof. The recombinant constructs of the present invention
comprise a vector, such as a plasmid or viral vector, into which a nucleic acid
having the sequence of any one of SEQ ID NOs:1-2 or a fragment thereof is
inserted, in a forward or reverse orientation. In the case of a vector comprising
one of the ORFs of the present invention, the vector may further comprise
regulatory sequences, including for example, a promoter, operably linked to the
ORF. For vectors comprising the EMFs and UMFs of the present invention, the
vector may further comprise a marker sequence or heterologous ORF operably
linked to the EMF or UMF. Large numbers of suitable vectors and promoters are
known to those of skill in the art and are commercially available for generating
the recombinant constructs of the present invention. The following vectors are
provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript
SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo,
pSV2cat, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL
(Pharmacia).

Promoter regions can be selected from any desired gene using
CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable
markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named
bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc.
Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early
and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the
art. 

Generally, recombinant expression vectors will include origins of 
replication and selectable markers permitting transformation of the host cell,
e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and
a promoter derived from a highly-expressed gene to direct transcription of a
downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK),
α-factor, acid phosphatase, or heat shock proteins, among others. The
heterologous structural sequence is assembled in appropriate phase with
translation initiation and termination sequences, and preferably, a leader 
sequence capable of directing secretion of translated protein into the 
periplasmic space or extracellular medium. Optionally, the heterologous 
sequence can encode a fusion protein including an N-terminal identification
peptide imparting desired characteristics, e.g., stabilization or simplified
purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by 
inserting a structural DNA sequence encoding a desired protein together with 
suitable translation initiation and termination signals in operable reading phase
with a functional promoter. The vector will comprise one or more phenotypic
selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable
prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella
typhimurium and various species within the genera Pseudomonas,
Streptomyces, and Staphylococcus, although others may also be employed as
a matter of choice. 

As a representative but nonlimiting example, useful expression 
vectors for bacterial use can comprise a selectable marker and bacterial origin 
of replication derived from commercially available plasmids comprising genetic
elements of the well known cloning vector pBR322 (ATCC 37017). Such
commercial vectors include, for example, pKK223-3 (Pharmacia Fine Chemicals,
Uppsala, Sweden) and GEM 1 (Promega Biotec, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and
the structural sequence to be expressed.

Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or
derepressed by appropriate means (e.g., temperature shift or chemical 
induction) and cells are cultured for an additional period. Cells are typically 
harvested by centrifugation, disrupted by physical or chemical means, and the
resulting crude extract retained for further purification.

Included within the scope of the nucleic acid sequences of the 
invention are nucleic acid sequences that hybridize under stringent conditions 
to a fragment of the DNA sequences in Figure 1, which fragment is greater than
about 10 bp, preferably 20-50 bp, and even greater than 100 bp. 

In accordance with the invention, polynucleotide sequences which
encode the novel nucleic acids, or functional equivalents thereof, may be used
to generate recombinant DNA molecules that direct the expression of that 
nucleic acid, or a functional equivalent thereof, in appropriate host cells. 

The nucleic acid sequences of the invention are further directed
to sequences which encode variants of the described nucleic acids. These
amino acid sequence variants may be prepared by methods known in the art by 
introducing appropriate nucleotide changes into a native or variant 
polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation.
The amino acid sequence variants of the nucleic acids are preferably
constructed by mutating the polynucleotide to give an amino acid sequence that 
does not occur in nature. These amino acid alterations can be made at sites 
that differ in the nucleic acids from different species (variable positions) or in 
highly conserved regions (constant regions). Sites at such locations will
typically be modified in series, e.g., by substituting first with conservative
choices (e.g., hydrophobic amino acid to a different hydrophobic amino acid)
and then with more distant choices (e.g., hydrophobic amino acid to a charged 
amino acid), and then deletions or insertions may be made at the target site. 

Amino acid sequence deletions generally range from about 1 to 30
residues, preferably about 1 to 10 residues, and are typically contiguous. Amino
acid insertions include amino- and/or carboxyl-terminal fusions ranging in length
from one to one hundred or more residues, as well as intrasequence insertions
of single or multiple amino acid residues. Intrasequence insertions may range
generally from about 1 to 10 amino acid residues, preferably from 1 to 5 residues.
Examples of terminal insertions include the heterologous signal sequences
necessary for secretion or for intracellular targeting in different host cells. 

In a preferred method, polynucleotides encoding the novel nucleic 
acids are changed via site-directed mutagenesis. This method uses 
oligonucleotide sequences that encode the polynucleotide sequence of the
desired amino acid variant, as well as sufficient adjacent nucleotides on both
sides of the changed amino acid to form a stable duplex on either side of the site
being changed. In general, the techniques of site-directed mutagenesis are
well known to those of skill in the art and this technique is exemplified by
publications such as Edelman et al., DNA 2:183 (1983). A versatile and
efficient method for producing site-specific changes in a polynucleotide
sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). 
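
The structure of a mutagenic oligonucleotide, a changed codon flanked on both sides by template sequence, can be sketched as follows. The function name, the 0-based `codon_start` convention, and the default flank of 12 nucleotides are assumptions for illustration only; the flank length needed for a stable duplex depends on sequence composition and reaction conditions and is not specified here.

```python
def mutagenic_oligo(template, codon_start, new_codon, flank=12):
    """Build an oligonucleotide carrying `new_codon` in place of the codon
    beginning at 0-based index `codon_start` of the coding-strand template,
    with `flank` unchanged nucleotides on each side.
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("replacement codon must be three nucleotides")
    if codon_start < flank or codon_start + 3 + flank > len(template):
        raise ValueError("template too short for the requested flanks")
    left = template[codon_start - flank:codon_start]
    right = template[codon_start + 3:codon_start + 3 + flank]
    return left + new_codon + right
```

The returned string matches the template everywhere except the replaced codon, so it can anneal on both sides of the site being changed.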

PCR may also be used to create amino acid sequence variants of
the novel nucleic acids. When small amounts of template DNA are used as
starting material, primer(s) that differ slightly in sequence from the
corresponding region in the template DNA can generate the desired amino acid
variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position
specified by the primer. The product DNA fragments replace the corresponding
region in the plasmid and this gives the desired amino acid variant.

A further technique for generating amino acid variants is the
cassette mutagenesis technique described in Wells et al., Gene 34:315 (1985);
and other mutagenesis techniques well known in the art, such as, for example,
the techniques in Sambrook et al., supra, and Current Protocols in Molecular
Biology, Ausubel et al.

Due to the inherent degeneracy of the genetic code, other DNA 
sequences which encode substantially the same or a functionally equivalent 
amino acid sequence may be used in the practice of the invention for the cloning 
and expression of these novel nucleic acids. Such DNA sequences include
those which are capable of hybridizing to the appropriate novel nucleic acid
sequence under stringent conditions. 

Furthermore, knowledge of the DNA sequence provided by the 
present invention allows for the modification of cells to permit, or increase, 
expression of endogenous CD39-like polypeptides. Cells can be modified (e.g.,
by homologous recombination) to provide increased CD39-like expression by
replacing, in whole or in part, the naturally occurring CD39-like promoter with all
or part of a heterologous promoter so that the cells express CD39-like
polypeptides at a higher level. The heterologous promoter is inserted in such
a manner that it is operatively linked to CD39-like encoding sequences. See,
for example, PCT International Publication No. WO94/12650, PCT International
Publication No. WO92/20808, and PCT International Publication No.
WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene
which encodes carbamyl phosphate synthase, aspartate transcarbamylase, and
dihydroorotase) and/or intron DNA may be inserted along with the heterologous
promoter DNA. If linked to the CD39-like coding sequence, amplification of the 
marker DNA by standard selection methods results in co-amplification of the 
CD39-like coding sequences in the cells. 

The polynucleotides of the present invention also make possible
the development, through, e.g., homologous recombination or knock out
strategies, of animals that fail to express functional CD39-L4 or that express a
variant of CD39-L4. Such animals are useful as models for studying the in vivo 
activities of CD39-L4 as well as for studying modulators of CD39-L4. 



6.8 Identification of Polymorphisms

Polymorphisms can be identified in a variety of ways known in the 
art which all generally involve obtaining a sample from a patient, analyzing DNA 
from the sample, optionally involving isolation or amplification of the DNA, and 
identifying the presence of the polymorphism in the DNA. For example, PCR
may be used to amplify an appropriate fragment of genomic DNA which may
then be sequenced. Alternatively, the DNA may be subjected to allele-specific 
oligonucleotide hybridization (in which appropriate oligonucleotides are 
hybridized to the DNA under conditions permitting detection of a single base 
mismatch) or to a single nucleotide extension assay (in which an oligonucleotide
that hybridizes immediately adjacent to the position of the polymorphism is
extended with one or more labelled nucleotides). In addition, traditional 
restriction fragment length polymorphism analysis (using restriction enzymes 
that provide differential digestion of the genomic DNA depending on the 
presence or absence of the polymorphism) may be performed. 
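
The principle behind restriction fragment length polymorphism analysis can be illustrated with a short in silico digest. The sketch below treats the DNA as a linear, single-stranded string and reports fragment lengths after cutting at every occurrence of a recognition site; the function name and the toy allele sequences are hypothetical. The example uses EcoRI, which recognizes GAATTC and cuts after the first G.

```python
def digest_fragment_lengths(dna, site, cut_offset):
    """Return fragment lengths from cutting linear DNA at each occurrence
    of `site`; `cut_offset` is the cut position within the site."""
    dna = dna.upper()
    site = site.upper()
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    # Fragment boundaries: start of molecule, each cut, end of molecule.
    bounds = [0] + cuts + [len(dna)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy alleles differing by one base inside the EcoRI site.
allele_with_site = "AAAAGAATTCTTTT"
allele_without_site = "AAAAGAGTTCTTTT"
```

Digesting the first allele yields two fragments while the second remains uncut, so the two alleles give distinguishable fragment patterns.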

Alternatively, a polymorphism resulting in a change in the amino
acid sequence could also be detected by detecting a corresponding change in
amino acid sequence of the protein, e.g., by an antibody specific to the variant 
sequence. 

6.9 Hosts

The present invention further provides host cells containing SEQ
ID NOs:1-2 of the present invention, wherein the nucleic acid has been
introduced into the host cell using known transformation, transfection or infection
methods. The host cell can be a higher eukaryotic host cell, such as a
mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or the host
cell can be a prokaryotic cell, such as a bacterial cell. Introduction of the
recombinant construct into the host cell can be effected by calcium phosphate
transfection, DEAE-dextran mediated transfection, or electroporation (Davis, L.
et al., Basic Methods in Molecular Biology (1986)).

The host cells containing one of SEQ ID NOs:1-2 of the present
invention can be used in conventional manners to produce the gene product
encoded by the isolated fragment (in the case of an ORF) or can be used to 
produce a heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the
ORFs of the present invention. These include, but are not limited to, eukaryotic
hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as
prokaryotic hosts such as E. coli and B. subtilis. The most preferred cells are
those which do not normally express the particular polypeptide or protein or
which express the polypeptide or protein at a low natural level.

Mature proteins can be expressed in mammalian cells, yeast,
bacteria, or other cells under the control of appropriate promoters. Cell-free
translation systems can also be employed to produce such proteins using RNAs 
derived from the DNA constructs of the present invention. Appropriate cloning 
and expression vectors for use with prokaryotic and eukaryotic hosts are
described by Sambrook, et al., in Molecular Cloning: A Laboratory Manual,
Second Edition, Cold Spring Harbor, New York (1989), the disclosure of which 
is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to 
express recombinant protein. Examples of mammalian expression systems
include the COS-7 lines of monkey kidney fibroblasts, described by Gluzman,
Cell 23:175 (1981), and other cell lines capable of expressing a compatible
vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines. Mammalian
expression vectors will comprise an origin of replication, a suitable promoter and
also any necessary ribosome binding sites, polyadenylation site, splice donor
and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral
genome, for example, SV40 origin, early promoter, enhancer, splice, and 
polyadenylation sites may be used to provide the required nontranscribed 
genetic elements. 

Recombinant polypeptides and proteins produced in bacterial
culture are usually isolated by initial extraction from cell pellets, followed by one
or more salting-out, aqueous ion exchange or size exclusion chromatography
steps. Protein refolding steps can be used, as necessary, in completing
configuration of the mature protein. Finally, high performance liquid
chromatography (HPLC) can be employed for final purification steps. Microbial
cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use 
of cell lysing agents. 

6.10 Peptides

The present invention further provides isolated polypeptides 
encoded by the nucleic acid fragments of the present invention or by degenerate 
variants of the nucleic acid fragments of the present invention. By "degenerate 
variant" is intended nucleotide fragments which differ from a nucleic acid
fragment of the present invention (e.g., an ORF) by nucleotide sequence but,
due to the degeneracy of the Genetic Code, encode an identical polypeptide 
sequence. Preferred nucleic acid fragments of the present invention are the 
ORFs which encode proteins. 

A variety of methodologies known in the art can be utilized to
obtain any one of the isolated polypeptides or proteins of the present invention.
At the simplest level, the amino acid sequence can be synthesized using 
commercially available peptide synthesizers. This is particularly useful in 
producing small peptides and fragments of larger polypeptides. Fragments are 
useful, for example, in generating antibodies against the native polypeptide. In
an alternative method, the polypeptide or protein is purified from bacterial cells
which naturally produce the polypeptide or protein. One skilled in the art can
readily follow known methods for isolating polypeptides and proteins in order to
obtain one of the isolated polypeptides or proteins of the present invention.
These include, but are not limited to, immunochromatography, HPLC,
size-exclusion chromatography, ion-exchange chromatography, and
immuno-affinity chromatography. See, e.g., Scopes, Protein Purification:
Principles and Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular
Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in Molecular
Biology.

The polypeptides and proteins of the present invention can
alternatively be purified from cells which have been altered to express the
desired polypeptide or protein. As used herein, a cell is said to be altered to
express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does
not produce or which the cell normally produces at a lower level. One skilled in
the art can readily adapt procedures for introducing and expressing either 
recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order 
to generate a cell which produces one of the polypeptides or proteins of the 
present invention. 

The purified polypeptides are used in in vitro binding assays which
are well known in the art to identify molecules which bind to the polypeptides.
These molecules include, but are not limited to, small molecules,
molecules from combinatorial libraries, antibodies or other proteins. The
molecules identified in the binding assay are then tested for antagonist or
agonist activity in in vivo tissue culture or animal models that are well known in
the art. In brief, the molecules are titrated into a plurality of cell cultures or 
animals and then tested for either cell/animal death or prolonged survival of the 
animal/cells. 

In addition, the binding molecules may be complexed with toxins,
e.g., ricin or cholera, or with other compounds that are toxic to cells. The
toxin-binding molecule complex is then targeted to the tumor or other cell by the
specificity of the binding molecule for SEQ ID NOs:3-4. 

6.11 Gene Therapy 

Mutations in the CD39-like gene that result in loss of normal
function of the CD39-like gene product underlie CD39-related human disease
states. The invention comprehends gene therapy to restore CD39-like activity
that would thus be indicated in treating those disease states. Delivery of a
functional CD39-like gene to appropriate cells is effected ex vivo, in situ, or in
vivo by use of vectors, and more particularly viral vectors (e.g., adenovirus,
adeno-associated virus, or a retrovirus), or ex vivo by use of physical DNA
transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For
additional reviews of gene therapy technology, see Friedmann, Science, 244:
1275-1281 (1989); Verma, Scientific American: 68-84 (1990); and Miller, Nature,
357: 455-460 (1992). Alternatively, it is contemplated that in other human
disease states, preventing the expression of or inhibiting the activity of
CD39-like polypeptides will be useful in treating the disease states. It is contemplated
that antisense therapy or gene therapy could be applied to negatively regulate
the expression of CD39-like polypeptides.

6.12 Antibodies 

In general, techniques for preparing polyclonal and monoclonal 
antibodies as well as hybridomas capable of producing the desired antibody are 
well known in the art (Campbell, A.M., Monoclonal Antibody Technology:
Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science 
Publishers, Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. 
35:1-21 (1990); Kohler and Milstein, Nature 256:495-497 (1975)), the trioma 
technique, the human B-cell hybridoma technique (Kozbor et al., Immunology
Today 4:72 (1983); Cole et al., in Monoclonal Antibodies and Cancer Therapy,
Alan R. Liss, Inc. (1985), pp. 77-96). 

Any animal (mouse, rabbit, etc.) which is known to produce 
antibodies can be immunized with a peptide or polypeptide of the invention. 
Methods for immunization are well known in the art. Such methods include 
subcutaneous or intraperitoneal injection of the polypeptide. One skilled in the 
art will recognize that the amount of the protein encoded by the ORF of the 
present invention used for immunization will vary based on the animal which is 
immunized, the antigenicity of the peptide and the site of injection. 

The protein which is used as an immunogen may be modified or 
administered in an adjuvant in order to increase the protein's antigenicity. 
Methods of increasing the antigenicity of a protein are well known in the art and 
include, but are not limited to, coupling the antigen with a heterologous protein 
(such as globulin or β-galactosidase) or through the inclusion of an adjuvant
during immunization. 

For monoclonal antibodies, spleen cells from the immunized 
animals are removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma 
cells, and allowed to become monoclonal antibody producing hybridoma cells. 

Any one of a number of methods well known in the art can be used 
to identify the hybridoma cell which produces an antibody with the desired 
characteristics. These include screening the hybridomas with an ELISA assay, 
western blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Research
175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the 
class and subclass are determined using procedures known in the art (Campbell,
A.M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry
and Molecular Biology, Elsevier Science Publishers, Amsterdam, The
Netherlands (1984)).

Techniques described for the production of single chain antibodies
(U.S. Patent 4,946,778) can be adapted to produce single chain antibodies to
proteins of the present invention.

For polyclonal antibodies, antibody-containing antiserum is isolated
from the immunized animal and is screened for the presence of antibodies with
the desired specificity using one of the above-described procedures.

The present invention further provides the above-described 
antibodies in detectably labeled form. Antibodies can be detectably labeled
through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.),
enzymatic labels (such as horseradish peroxidase, alkaline phosphatase, etc.),
fluorescent labels (such as FITC or rhodamine, etc.), paramagnetic atoms, etc.
Procedures for accomplishing such labeling are well known in the art; for
example, see Sternberger, L.A. et al., J. Histochem. Cytochem. 18:315 (1970);
Bayer, E.A. et al., Meth. Enzym. 62:308 (1979); Engval, E. et al., Immunol.
109:129 (1972); Goding, J.W., J. Immunol. Meth. 13:215 (1976).

The labeled antibodies of the present invention can be used for in 
vitro, in vivo, and in situ assays to identify cells or tissues in which a fragment 
of the polypeptide of interest is expressed. The antibodies may also be used 
directly in therapies or other diagnostics. 

The present invention further provides the above-described
antibodies immobilized on a solid support. Examples of such solid supports
include plastics such as polycarbonate, complex carbohydrates such as agarose
and sepharose, and acrylic resins such as polyacrylamide and latex beads.
Techniques for coupling antibodies to such solid supports are well known in the
art (Weir, D.M. et al., "Handbook of Experimental Immunology" 4th Ed.,
Blackwell Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby,
W.D. et al., Meth. Enzym. 34 Academic Press, N.Y. (1974)). The immobilized
antibodies of the present invention can be used for in vitro, in vivo, and in situ
assays as well as for immuno-affinity purification of the proteins of the present
invention.

6.13 Computer Readable Sequences

In one application of this embodiment, a nucleotide sequence of 
the present invention can be recorded on computer readable media. As used 
herein, "computer readable media" refers to any medium which can be read and 
accessed directly by a computer. Such media include, but are not limited to:
magnetic storage media, such as floppy discs, hard disc storage medium, and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media 
such as RAM and ROM; and hybrids of these categories such as 
magnetic/optical storage media. A skilled artisan can readily appreciate how
any of the presently known computer readable media can be used to create
a manufacture comprising computer readable medium having recorded thereon 
a nucleotide sequence of the present invention. 

As used herein, "recorded" refers to a process for storing 
information on computer readable medium. A skilled artisan can readily adopt
any of the presently known methods for recording information on computer
readable medium to generate manufactures comprising the nucleotide sequence 
information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating 
a computer readable medium having recorded thereon a nucleotide sequence
of the present invention. The choice of the data storage structure will generally
be based on the means chosen to access the stored information. In addition, a
variety of data processor programs and formats can be used to store the
nucleotide sequence information of the present invention on computer readable
medium. The sequence information can be represented in a word processing
text file, formatted in commercially-available software such as WordPerfect and
Microsoft Word, or represented in the form of an ASCII file, stored in a database
application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can
readily adapt any number of data processor structuring formats (e.g. text file or
database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
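
By way of illustration only, the two storage formats mentioned above (an ASCII text file and a database table) can be sketched as follows. The FASTA layout, the SQLite engine (standing in for DB2, Sybase or Oracle), the table name and the placeholder sequence are all assumptions made for this sketch; the sequence shown is not a sequence of the present invention.

```python
import sqlite3
from pathlib import Path

# Hypothetical placeholder record; NOT a sequence from this specification.
RECORD_ID = "example_seq"
SEQUENCE = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"


def write_fasta(path, record_id, sequence, width=60):
    """Record a sequence as a plain ASCII text file in FASTA layout."""
    lines = [f">{record_id}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")


def read_fasta(path):
    """Read a single-record FASTA file back as (record_id, sequence)."""
    lines = Path(path).read_text(encoding="ascii").splitlines()
    return lines[0][1:], "".join(lines[1:])


def store_in_database(conn, record_id, sequence):
    """Record the same sequence in a relational table (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences (id TEXT PRIMARY KEY, seq TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", (record_id, sequence)
    )
    conn.commit()


def fetch_from_database(conn, record_id):
    """Return the stored sequence, or None if the record is absent."""
    row = conn.execute(
        "SELECT seq FROM sequences WHERE id = ?", (record_id,)
    ).fetchone()
    return row[0] if row else None
```

Either form leaves the sequence readable by a later search program; which is preferable depends, as stated above, on the means chosen to access the stored information.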

By providing the nucleotide sequence of SEQ ID NOs:1-2, a
representative fragment thereof, or a nucleotide sequence at least 99.9%
identical to SEQ ID NOs:1-2 in computer readable form, a skilled artisan can
routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence
information provided in a computer readable medium. The examples which
follow demonstrate how software which implements the BLAST (Altschul et al.,
J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem.
17:203-207 (1993)) search algorithms on a Sybase system is used to identify
open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially 
important proteins such as enzymes used in fermentation reactions and in the 
production of commercially useful metabolites. 
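
As a rough illustration of what identifying an ORF in a recorded sequence involves, the sketch below scans the three forward reading frames for an ATG codon followed in frame by a stop codon. It is a plain codon scan, not the BLAST or BLAZE software referred to above, and the function name and minimum-length parameter are assumptions of this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG..stop ORFs on the forward strand.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs
```

A fuller treatment would also scan the reverse complement and would handle ORFs that run off the end of a partial sequence such as an expressed sequence tag.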

As used herein, "a computer-based system" refers to the hardware
means, software means, and data storage means used to analyze the nucleotide
sequence information of the present invention. The minimum hardware means
of the computer-based systems of the present invention comprises a central
processing unit (CPU), input means, output means, and data storage means.
A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present 
invention comprise a data storage means having stored therein a nucleotide 
sequence of the present invention and the necessary hardware means and 
software means for supporting and implementing a search means. As used
herein, "data storage means" refers to memory which can store nucleotide
sequence information of the present invention, or a memory access means 
which can access manufactures having recorded thereon the nucleotide 
sequence information of the present invention. 

As used herein, "search means" refers to one or more programs
which are implemented on the computer-based system to compare a target
sequence or target structural motif with the sequence information stored within
the data storage means. Search means are used to identify fragments or
regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of
commercially available software for conducting search means exists and can be
used in the computer-based systems of the present invention. Examples of such
software include, but are not limited to, MacPattern (EMBL), BLASTN and
BLASTX (NCBI). A skilled artisan can readily recognize that any
one of the available algorithms or implementing software packages for
conducting homology searches can be adapted for use in the present
computer-based systems.
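
A minimal sketch of a search means, assuming the data storage means is simply an in-memory mapping from record identifier to sequence, is given below. It reports exact matches of a target sequence on either strand; it is not MacPattern or BLAST, and the function and variable names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def search(database, target):
    """Return (record_id, strand, position) for each exact match of target.

    database maps record identifiers to DNA sequences; position is the
    0-based offset of the match in the stored (forward) sequence.
    """
    target = target.upper()
    queries = [("+", target)]
    rc = reverse_complement(target)
    if rc != target:  # avoid double-counting palindromic targets
        queries.append(("-", rc))
    hits = []
    for record_id, seq in database.items():
        seq = seq.upper()
        for strand, query in queries:
            pos = seq.find(query)
            while pos != -1:
                hits.append((record_id, strand, pos))
                pos = seq.find(query, pos + 1)  # allow overlapping matches
    return hits
```

Homology searching in practice tolerates mismatches and gaps, which this exact-match sketch deliberately omits.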

As used herein, a "target sequence" can be any nucleic acid or 
amino acid sequence of six or more nucleotides or two or more amino acids. A 
skilled artisan can readily recognize that the longer a target sequence is, the
less likely a target sequence will be present as a random occurrence in the
database. The most preferred sequence length of a target sequence is from
about 10 to 100 amino acids or from about 30 to 300 nucleotide residues.
However, it is well recognized that searches for commercially important
fragments, such as sequence fragments involved in gene expression and protein
processing, may be of shorter length.
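
The relationship between target length and random occurrence can be made concrete under a simplifying assumption that is not part of this specification: residues are independent and equally frequent. The expected number of chance matches of a target of length L in a database of N residues is then approximately (N - L + 1) divided by the alphabet size raised to the power L.

```python
def expected_chance_matches(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches of one target in a database.

    Assumes independent, equally frequent residues (a simplification);
    alphabet_size is 4 for nucleotides and 20 for amino acids.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_length
```

Under these assumptions a six-nucleotide target is expected about once by chance in roughly 4,096 positions, whereas a 30-nucleotide target is expected far less than once even in a database of billions of residues.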

As used herein, "a target structural motif," or "target motif," refers 
to any rationally selected sequence or combination of sequences in which the 
sequence(s) are chosen based on a three-dimensional configuration which is 
formed upon the folding of the target motif. There are a variety of target motifs
known in the art. Protein target motifs include, but are not limited to, enzyme
active sites and signal sequences. Nucleic acid target motifs include, but are 
not limited to, promoter sequences, hairpin structures and inducible expression 
elements (protein binding sequences). 
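
Unlike a target sequence, a target motif is defined by the structure it can form rather than by one fixed string. As a hedged illustration, the sketch below looks for a simple hairpin: a short stem followed, after a loop, by the reverse complement of that stem. The stem and loop lengths are arbitrary assumptions, and only perfect Watson-Crick stems are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Return (start, stem_sequence, loop_length) for candidate hairpins.

    A candidate is a stem of the given length that is followed, after a
    loop of min_loop..max_loop bases, by its own reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        partner = left.translate(COMPLEMENT)[::-1]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # where the pairing arm would begin
            if seq[j:j + stem] == partner:
                hits.append((i, left, loop))
    return hits
```

For example, in the constructed sequence AAGCGATTTTTCGCAA the stem GCGA at offset 2 pairs with TCGC across a four-base loop.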

6.14 Expression Modulating Sequences

EMF sequences can be identified within a genome by their 
proximity to the ORFs. An intergenic segment, or a fragment of the intergenic 
segment, from about 10 to 200 nucleotides in length, taken 5' from any ORF will 
modulate the expression of an operably linked 3' ORF in a fashion similar to that
found with the naturally linked ORF sequence. As used herein, an "intergenic
segment" refers to the fragments of a genome which are between two ORFs
herein described. Alternatively, EMFs can be identified using known EMFs as
a target sequence or target motif in the computer-based systems of the present
invention.

The presence and activity of an EMF can be confirmed using an 
EMF trap vector. An EMF trap vector contains a cloning site 5' to a marker 
sequence. A marker sequence encodes an identifiable phenotype, such as 
antibiotic resistance or a complementing nutrition auxotrophic factor, which can
be identified or assayed when the EMF trap vector is placed within an
appropriate host under appropriate conditions. As described above, an EMF will
modulate the expression of an operably linked marker sequence. A more
detailed discussion of various marker sequences is provided below.

A sequence which is suspected as being an EMF is cloned in all three reading
frames in one or more restriction sites upstream from the marker sequence in
the EMF trap vector. The vector is then transformed into an appropriate host 
using known procedures and the phenotype of the transformed host is examined 
under appropriate conditions. As described above, an EMF will modulate the 
expression of an operably linked marker sequence. 


6.15 Triple Helix Formation

In addition, the fragments of the present invention, as broadly 
described, can be used to control gene expression through triple helix formation 
or antisense DNA or RNA, both of which methods are based on the binding of 
a polynucleotide sequence to DNA or RNA. Polynucleotides suitable for use in
these methods are usually 20 to 40 bases in length and are designed to be
complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456
(1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself
(antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as
Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)).

Triple helix formation optimally results in a shut-off of RNA
transcription from DNA, while antisense RNA hybridization blocks translation of
an mRNA molecule into polypeptide. Both techniques have been demonstrated
to be effective in model systems. Information contained in the sequences of the
present invention is necessary for the design of an antisense or triple helix
oligonucleotide.
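
The dependence of antisense design on sequence information can be sketched as follows: given an mRNA sequence and a chosen target window of 20 to 40 bases, the antisense DNA oligonucleotide is the reverse complement of that window. The function name and the example mRNA are hypothetical, and considerations such as target accessibility, melting temperature and chemical modification are outside this sketch.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo, written 5'->3', complementary to an mRNA window.

    mrna may use U or T; start is the 0-based offset of the target window.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    window = mrna.upper().replace("U", "T")[start:start + length]
    if len(window) < length:
        raise ValueError("target window runs past the end of the mRNA")
    return window.translate(COMPLEMENT)[::-1]
```

For instance, an oligo targeted to a window that begins at an AUG start codon ends in CAT when written 5' to 3'.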

6.16 Diagnostic Assays and Kits 

The present invention further provides methods to identify the 
expression of one of the ORFs of the present invention, or homolog thereof, in 
a test sample, using a nucleic acid probe or antibodies of the present invention. 

In detail, such methods comprise incubating a test sample with one 
or more of the antibodies or one or more of the nucleic acid probes of the present
invention and assaying for binding of the nucleic acid probes or antibodies to 
components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a 
test sample vary. Incubation conditions depend on the format employed in the 
assay, the detection methods employed, and the type and nature of the nucleic 
acid probe or antibody used in the assay. One skilled in the art will recognize 
that any one of the commonly available hybridization, amplification or 
immunological assay formats can readily be adapted to employ the nucleic acid 
probes or antibodies of the present invention. Examples of such assays can be 
found in Chard, T., An Introduction to Radioimmunoassay and Related 
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); 
Bullock, G.R. et al., Techniques in Immunocytochemistry, Academic Press,
Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and
Theory of Immunoassays: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands
(1985).

The test samples of the present invention include cells, protein or 
membrane extracts of cells, or biological fluids such as sputum, blood, serum, 
plasma, or urine. The test sample used in the above-described method will vary 
based on the assay format, nature of the detection method and the tissues, cells 
or extracts used as the sample to be assayed. Methods for preparing protein
extracts or membrane extracts of cells are well known in the art and can
readily be adapted in order to obtain a sample which is compatible with the
system utilized. 

In another embodiment of the present invention, kits are provided 
which contain the necessary reagents to carry out the assays of the present
invention. 

Specifically, the invention provides a compartment kit to receive, 
in close confinement, one or more containers which comprise: (a) a first
container comprising one of the probes or antibodies of the present invention;
and (b) one or more other containers comprising one or more of the following:
wash reagents, reagents capable of detecting presence of a bound probe or 
antibody. 

In detail, a compartment kit includes any kit in which reagents are 
contained in separate containers. Such containers include small glass
containers, plastic containers or strips of plastic or paper. Such containers
allow one to efficiently transfer reagents from one compartment to another
compartment such that the samples and reagents are not cross-contaminated,
and the agents or solutions of each container can be added in a quantitative
fashion from one compartment to another. Such containers will include a
container which will accept the test sample, a container which contains the
antibodies used in the assay, containers which contain wash reagents (such as
phosphate buffered saline, Tris-buffers, etc.), and containers which contain the
reagents used to detect the bound antibody or probe.

Types of detection reagents include labeled nucleic acid probes, 
labeled secondary antibodies, or in the alternative, if the primary antibody is
labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize
that the disclosed probes and antibodies of the present invention can be readily
incorporated into one of the established kit formats which are well known in the
art.

6.17 Screening Assays 

Using the isolated proteins of the present invention, the present 
invention further provides methods of obtaining and identifying agents which 
bind to a protein encoded by one of the ORFs from a nucleic acid with a
sequence of one of SEQ ID NOs:1-2, or to a nucleic acid with a sequence of one
of SEQ ID NOs:1-2.

In detail, said method comprises the steps of: (a) contacting an 
agent with an isolated protein encoded by one of the ORFs of the present 
invention, or nucleic acid of the invention; and (b) determining whether the
agent binds to said protein or said nucleic acid. 

The agents screened in the above assay can be, but are not 
limited to, peptides, carbohydrates, vitamin derivatives, or other pharmaceutical 
agents. The agents can be selected and screened at random or rationally 
selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, 
pharmaceutical agents and the like are selected at random and are assayed for 
their ability to bind to the protein encoded by the ORF of the present invention. 

Alternatively, agents may be rationally selected or designed. As 
used herein, an agent is said to be "rationally selected or designed" when the
agent is chosen based on the configuration of the particular protein. For
example, one skilled in the art can readily adapt currently available procedures
to generate peptides, pharmaceutical agents and the like capable of binding to
a specific peptide sequence in order to generate rationally designed antipeptide
peptides, for example see Hurby et al., "Application of Synthetic Peptides:
Antisense Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY
(1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989), or
pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present
invention, as broadly described, can be used to control gene expression through
binding to one of the ORFs or EMFs of the present invention. As described
above, such agents can be randomly screened or rationally designed/selected.
Targeting the ORF or EMF allows a skilled artisan to design sequence specific
or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control.

One class of DNA binding agents are agents which contain base
residues which hybridize or form a triple helix by binding to DNA or
RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid
backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 
bases and are designed to be complementary to a region of the gene involved 
in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); 
Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360
(1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991);
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix formation optimally results in a shut-off of
RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the
sequences of the present invention is necessary for the design of an antisense
or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be used as a diagnostic agent, in the control of bacterial 
infection by modulating the activity of the protein encoded by the ORF. Agents 
which bind to a protein encoded by one of the ORFs of the present invention can 
be formulated using known techniques to generate a pharmaceutical 
composition. 

6.18 Use of Nucleic Acids as Probes 

Another aspect of the subject invention is to provide for 
polypeptide-specific nucleic acid hybridization probes capable of hybridizing with 
naturally occurring nucleotide sequences. The hybridization probes of the 
subject invention may be derived from the nucleotide sequence of the SEQ ID 
NOs:1-2. Because the corresponding gene is expressed in only one out of 18
tissues tested, namely macrophages, a hybridization probe derived from SEQ
ID NOs:1-2 can be used as an indicator of the presence of macrophage RNA in
a sample. Any suitable hybridization technique can be employed, such as, for 
example, in situ hybridization. 

PCR as described in US Patent Nos. 4,683,195 and 4,965,188
provides additional uses for oligonucleotides based upon the nucleotide
sequences. Such probes used in PCR may be of recombinant origin, may be
chemically synthesized, or a mixture of both. The probe will comprise a discrete 
nucleotide sequence for the detection of identical sequences or a degenerate 
pool of possible sequences for identification of closely related genomic 
sequences. 

Other means for producing specific hybridization probes for nucleic 
acids include the cloning of nucleic acid sequences into vectors for the 
production of mRNA probes. Such vectors are known in the art and are 
commercially available and may be used to synthesize RNA probes in vitro by 
means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides.

The nucleotide sequences may be used to construct hybridization 
probes for mapping their respective genomic sequences. The nucleotide 
sequence provided herein may be mapped to a chromosome or specific regions
of a chromosome using well known genetic and/or chromosomal mapping
techniques. These techniques include in situ hybridization, linkage analysis
against known chromosomal markers, hybridization screening with libraries or
flow-sorted chromosomal preparations specific to known chromosomes, and the
like. The technique of fluorescent in situ hybridization of chromosome spreads
has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.
Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map
data. Examples of genetic map data can be found in the 1994 Genome Issue
of Science (265:1981f). Correlation between the location of a nucleic acid on
a physical chromosomal map and a specific disease (or predisposition to a
specific disease) may help delimit the region of DNA associated with that genetic
disease. The nucleotide sequences of the subject invention may be used to
detect differences in gene sequences between normal, carrier or affected
individuals. 

The nucleotide sequence may be used to produce purified 
polypeptides using well known methods of recombinant DNA technology. 
Among the many publications that teach methods for the expression of genes
after they have been isolated is Goeddel (1990) Gene Expression Technology,
Methods in Enzymology, Vol 185, Academic Press, San Diego. Polypeptides
may be expressed in a variety of host cells, either prokaryotic or eukaryotic.
Host cells may be from the same species from which a particular polypeptide
nucleotide sequence was isolated or from a different species. Advantages of
producing polypeptides by recombinant DNA technology include obtaining
adequate amounts of the protein for purification and the availability of simplified
purification procedures.

Each sequence so obtained was compared to sequences in 
GenBank using a search algorithm developed by Applied Biosystems and 
incorporated into the INHERIT™ 670 Sequence Analysis System. In this 
algorithm, Pattern Specification Language (developed by TRW Inc., Los 
Angeles, CA) was used to determine regions of homology. The three 
parameters that determine how the sequence comparisons run were window 
size, window offset, and error tolerance. Using a combination of these three 
parameters, the DNA database was searched for sequences containing regions 
of homology to the query sequence, and the appropriate sequences were scored 
with an initial value. Subsequently, these homologous regions were examined 
using dot matrix homology plots to distinguish regions of homology from chance 
matches. Smith-Waterman alignments were used to display the results of the 
homology search. 
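
The Smith-Waterman alignment mentioned above is a standard dynamic-programming method for local alignment. A minimal sketch with a linear gap penalty follows; the scoring values are illustrative assumptions, and the sketch does not reproduce the INHERIT system or the Pattern Specification Language search that preceded the alignment step.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # H[i][j]: best score ending at a[i-1], b[j-1]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the highest-scoring cell until the score falls to zero.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))
```

Because every cell is floored at zero, the alignment is free to begin and end anywhere in either sequence, which is what makes it local rather than global.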

Peptide and protein sequence homologies were ascertained using 
the INHERIT™ 670 Sequence Analysis System in a way similar to that used in 
DNA sequence homologies. Pattern Specification Language and parameter 
windows were used to search protein databases for sequences containing 
regions of homology which were scored with an initial value. Dot-matrix 
homology plots were examined to distinguish regions of significant homology 
from chance matches. 

Alternatively, BLAST, which stands for Basic Local Alignment 
Search Tool, is used to search for local sequence alignments (Altschul SF 
(1993) J Mol Evol 36:290-300; Altschul, SF et al. (1990) J Mol Biol 215:403-10).
BLAST produces alignments of both nucleotide and amino acid sequences to 
determine sequence similarity. Because of the local nature of the alignments, 
BLAST is especially useful in determining exact matches or in identifying 
homologs. Whereas it is ideal for matches which do not contain gaps, it is 
inappropriate for performing motif-style searching. The fundamental unit of
BLAST algorithm output is the High-scoring Segment Pair (HSP). 

An HSP consists of two sequence fragments of arbitrary but equal 
lengths whose alignment is locally maximal and for which the alignment score 
meets or exceeds a threshold or cutoff score set by the user. The BLAST 
approach is to look for HSPs between a query sequence and a database 
sequence, to evaluate the statistical significance of any matches found, and to 
report only those matches which satisfy the user-selected threshold of 
significance. The parameter E establishes the statistically significant threshold 
for reporting database sequence matches. E is interpreted as the upper bound 
of the expected frequency of chance occurrence of an HSP (or set of HSPs) 
within the context of the entire database search. Any database sequence whose 
match satisfies E is reported in the program output. 
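
The role of the threshold E can be illustrated with the Karlin-Altschul estimate commonly associated with BLAST statistics, in which the expected number of chance HSPs with score at least S is K·m·n·exp(-λS) for a query of length m and a database of length n. The values of K and λ depend on the scoring system; those used below are placeholders chosen only for this sketch, not values taken from this specification.

```python
import math


def expected_hsps(score, query_len, db_len, K=0.1, lam=0.3):
    """Expected number of chance HSPs scoring at least `score`.

    K and lam are illustrative placeholders, not calibrated parameters.
    """
    return K * query_len * db_len * math.exp(-lam * score)


def reportable(hits, query_len, db_len, e_threshold=10.0):
    """Keep (name, score) hits whose expected chance frequency is within E."""
    return [
        (name, score)
        for name, score in hits
        if expected_hsps(score, query_len, db_len) <= e_threshold
    ]
```

Lowering the threshold E makes reporting more stringent, since only higher-scoring HSPs have a sufficiently small expected chance frequency.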

In addition, BLAST analysis was used to search for related 
molecules within the libraries of the LIFESEQ™ database. This process, an 
"electronic northern" analysis is analogous to northern blot analysis in that it 
uses one cellubrevin sequence at a time to search for identical or homologous 
molecules at a set stringency. The stringency of the electronic northern is based
on "product score". The product score is defined as (% nucleotide or amino acid
identity [between the query and reference sequences] in BLAST multiplied by the %
maximum possible BLAST score [based on the lengths of query and reference 
sequences]) divided by 100. At a product score of 40, the match will be exact 
within a 1-2% error; and at 70, the match will be exact. Homologous or related 
molecules can be identified by selecting those which show product scores 
between approximately 15 and 30. 
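
The product score defined above is a simple arithmetic combination, sketched below. How the maximum possible BLAST score is derived from the sequence lengths is not specified here, so the sketch takes it as an input; the function and parameter names are hypothetical.

```python
def product_score(percent_identity, blast_score, max_possible_score):
    """(% identity x % of maximum possible BLAST score) / 100."""
    percent_of_max = 100.0 * blast_score / max_possible_score
    return percent_identity * percent_of_max / 100.0
```

For example, a match with 80% identity that attains half of the maximum possible BLAST score has a product score of 40, and a perfect full-length match has a product score of 100.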

6.19 SEQ ID NOs:1-8 

Referring to Figure 1, SEQ ID NO:1 is the nucleotide sequence of
an expressed sequence tag corresponding to a polynucleotide isolated from a
cDNA library of human fetal liver-spleen. SEQ ID NO:2 is an extended version
of SEQ ID NO:1 obtained as described in Example 34, and the encoded
polypeptide in SEQ ID NO: 3 is referred to herein as CD39-L66. SEQ ID NO:2
encodes a polypeptide having the amino acid sequence of SEQ ID NO:3 (shown
in Figure 2). The open reading frame corresponding to SEQ ID NO:3 starts at
nucleotide 246, as numbered from the 5' end of SEQ ID NO:2. This open
reading frame encodes a polypeptide 428 amino acids in length. The estimated 
molecular weight of the unglycosylated polypeptide is approximately 47.52 kDa. 
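
The figures above can be cross-checked with simple arithmetic. The sketch assumes 1-based nucleotide numbering, an uninterrupted reading frame followed by a single stop codon, and a rule-of-thumb average residue mass of about 110 Da; none of these assumptions is stated in the specification, and the stated 47.52 kDa presumably reflects the actual amino acid composition.

```python
ORF_START = 246           # first nucleotide of the start codon (from the text)
POLYPEPTIDE_LENGTH = 428  # amino acids (from the text)
AVG_RESIDUE_MASS_DA = 110 # rule-of-thumb average; an assumption of this sketch

coding_nt = 3 * POLYPEPTIDE_LENGTH                # nucleotides encoding the polypeptide
last_coding_nt = ORF_START + coding_nt - 1        # last nucleotide of the final codon
stop_codon = (last_coding_nt + 1, last_coding_nt + 3)
approx_mass_kda = POLYPEPTIDE_LENGTH * AVG_RESIDUE_MASS_DA / 1000

print(coding_nt, last_coding_nt, stop_codon, approx_mass_kda)
```

On these assumptions the coding region spans nucleotides 246 to 1529, the stop codon occupies 1530 to 1532, and the rough mass estimate of about 47.1 kDa is consistent with the stated value.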

Protein database searches with the BLAST algorithm indicate that 
SEQ ID NO:3 is homologous to the CD39 family. Figure 3 shows the amino acid 
sequence alignment between SEQ ID NO:3 (identified as "246 prot") and human
CD39 ("CD39Human.seq"), indicating that the two sequences share 30% amino
acid sequence identity. Moreover, a higher degree of homology between the
apyrase conserved regions (Kaczmarek et al., J. Biol. Chem. 271:33116-33122
(1996)) is observed. In particular, an almost perfect match to a putative
ATP-binding region was found from amino acids 54-58, DAGST (DAGSS in
CD39). In addition, the DLGGASTQ motif (DLGGASTQ in CD39), which is very
well conserved among ATPDases, is found from amino acids 199-206 in SEQ
ID NO:3. Other regions conserved in apyrases were found from amino acids
129-134, ATAGLR (ATAGMR in CD39) and from amino acids 169-173, GSDEG
(GQEEG in CD39). 
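
The degree of conservation in the four regions named above can be tabulated directly from the residues quoted in the text. The sketch below computes position-by-position identity for each pair and checks that each quoted residue range matches the motif length; the data structure and function names are assumptions of this sketch.

```python
# (first residue, last residue, SEQ ID NO:3 motif, CD39 motif), as quoted above.
CONSERVED_REGIONS = [
    (54, 58, "DAGST", "DAGSS"),
    (129, 134, "ATAGLR", "ATAGMR"),
    (169, 173, "GSDEG", "GQEEG"),
    (199, 206, "DLGGASTQ", "DLGGASTQ"),
]


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length motifs."""
    if len(a) != len(b):
        raise ValueError("motifs must be the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


for first, last, ours, cd39 in CONSERVED_REGIONS:
    assert last - first + 1 == len(ours)  # quoted range matches motif length
    print(f"{first}-{last}: {ours} vs {cd39}: {percent_identity(ours, cd39):.0f}%")
```

Each of these regions is at least 60% identical to its CD39 counterpart, compared with the 30% identity reported for the alignment as a whole.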

SEQ ID NO:3 differs from CD39 in that SEQ ID NO:3 contains a
hydrophobic stretch of 22 amino acids at its amino terminus, which is indicative
of a leader peptide. SEQ ID NO:3 also lacks the transmembrane domain found
at the carboxyl terminus of CD39. These features indicate that SEQ ID NO:3 is
a soluble ATPDase. 

SEQ ID NO:3 shares an even higher degree of homology (83%
identity) with a murine NTPase, as shown in the amino acid sequence alignment
presented in Fig. 4 (SEQ ID NO:3 is identified as "246 prot," and mouse CD39
as "mur ntpase").

The message encoding SEQ ID NO:3 is tightly regulated in a
tissue-specific manner. An expression study using a semiquantitative
PCR/Southern blot approach revealed a significant level of expression in
macrophage. In contrast, human CD39 is expressed in tissues such as
placenta, lung, skeletal muscle, kidney, and heart.

SEQ ID NO: 4 is the polynucleotide sequence for CD39-L4, 
described in Chadwick et al., Genomics, 50(3):357-67 (1998); SEQ ID NO: 5 is
the corresponding amino acid sequence. 

SEQ ID NO: 6 is the polynucleotide sequence for a CD39-L4 
variant designated ACRIII, wherein the following amino acid substitutions have
been made: D168-T, S170-Q and L175-F; SEQ ID NO: 7 is the corresponding 
amino acid sequence. 

SEQ ID NO: 8 is the genomic sequence for the human CD39-L4
gene; exons appear at nucleotides 1-288 (exon 1), 1281-1580 (exon 2),
1820-1855 (exon 3), 2467-2555 (exon 4), 2863-2942 (exon 5), 3889-3950 (exon 6),
4894-4995 (exon 7), 5847-5987 (exon 8), 6966-7138 (exon 9) and 8556-9365
(exon 10).
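The exon coordinates listed for SEQ ID NO: 8 can be tabulated to obtain exon and intron lengths. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that exon 3 spans nucleotides 1820-1855; the names `EXONS`, `exon_lengths` and `intron_lengths` are illustrative only and do not appear in the specification.

```python
# Exon coordinates (assumed 1-based, inclusive) as listed for SEQ ID NO: 8.
EXONS = [
    (1, 288), (1281, 1580), (1820, 1855), (2467, 2555), (2863, 2942),
    (3889, 3950), (4894, 4995), (5847, 5987), (6966, 7138), (8556, 9365),
]


def exon_lengths(exons):
    """Length of each exon in nucleotides."""
    return [end - start + 1 for start, end in exons]


def intron_lengths(exons):
    """Length of each intron lying between consecutive exons."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])]


if __name__ == "__main__":
    for number, length in enumerate(exon_lengths(EXONS), start=1):
        print(f"exon {number}: {length} nt")
    print("total exonic sequence:", sum(exon_lengths(EXONS)), "nt")
    print("genomic span:", EXONS[-1][1] - EXONS[0][0] + 1, "nt")
```

Under these assumptions the ten exons lie within a genomic span of 9365 nucleotides, in line with the approximately 9.3 kb stated in Example 10.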



6.20 Uses of Novel CD39-Like Polypeptides and Antibodies 

Polypeptides of the invention having ATPDase activity are useful
for inhibiting platelet function and can therefore be employed in the prophylaxis
or treatment of pathological conditions caused by or involving thrombosis or
excessive coagulation or excessive platelet aggregation, such as myocardial
infarction, cerebral ischemia, angina, and the like. Polypeptides of the invention
can also be used in the maintenance of vascular grafts. Platelet function can be
measured by any of a number of standard assays, such as, for example, the
platelet aggregation assay described in Example 5.

Such pathological conditions include conditions caused by or
involving arterial thrombosis, such as coronary artery thrombosis and resulting
myocardial infarction, cerebral artery thrombosis or intracardiac thrombosis (due
to, e.g., atrial fibrillation) and resulting stroke, and other peripheral arterial
thrombosis and occlusion; conditions associated with venous thrombosis, such
as deep venous thrombosis and pulmonary embolism; conditions associated
with exposure of the patient's blood to a foreign or injured tissue surface,
including diseased heart valves, mechanical heart valves, vascular grafts, and
other extracorporeal devices such as intravascular cannulas, vascular access
shunts in hemodialysis patients, hemodialysis machines and cardiopulmonary
bypass machines; and conditions associated with coagulopathies, such as
hypercoagulability and disseminated intravascular coagulopathy.
Co-administration of other agents suitable for treating the pathological condition,
e.g., other anti-coagulation agents, is also contemplated.

In particular, variants like the ACRIII mutant described herein are 
expected to be superior therapeutics for treating such pathological conditions 
because (1) ACRIII exhibits six-fold greater activity compared to wild type
CD39-L4, and (2) ACRIII, like CD39-L4, is uniquely specific for ADP and does not
hydrolyze ATP. Thus, adverse side effects from hydrolysis of circulating ATP 
are avoided. 

For instance, ATP is known to act as an extracellular signal in
many tissues. In the heart, extracellular ATP modulates ionic processes and
contractile function (for review see Burnstock, G., Neuropharmacology 36:1127).
Recently, it has been shown that extracellular ATP markedly inhibits glucose
transport in rat cardiomyocytes (Fisher Y. et al., J. Biol. Chem. 274:755-761).
Another source of extracellular ATP is that released from parenchymal cells
under hypoxic or ischemic conditions (Skobel, E., and Kammermeier, H.
Biochim. Biophys. Acta 1362:128-134). ATP is also involved in the modulation
of anti-IgE-induced release of histamine from human lung mast cells (Schulman
E. S., et al., Am. J. Respir. Cell Mol. Biol. 20:520-537).

Furthermore, the ability of CD39-L4 to hydrolyze NDPs other than
ADP has implications outside the circulatory system. For instance, it has been
reported that UDP is the most potent agonist for the human P2Y6 receptor.
Communi, et al., Bioch Bioph Res Com 222:303-308 (1996). This receptor is
expressed in several tissues including infiltrating T cells present in inflammatory
bowel disease. Somers, et al., Lab Invest 78:1375-1383 (1998). In this
microenvironment, a molecule with the enzymatic properties of CD39-L4 could
influence T cell responses by modifying the extracellular half-life of UDP.
Another role for CD39-L4 has been suggested by the report that mouse
CD39-L4 maps closely to a locus associated with audiogenic brain seizures in mice.
See Chadwick, et al., Genomics 50:357-367 (1998); Seyfried, et al., Genetics
99:117-126 (1981). This locus, known as Asp-1, is thought to be linked or to
correspond to a factor that influences Ca2+-ATPase activity. Neumann, et al.,
Behav. Genetics 20:307-323 (1990).

Additionally, the polypeptides of the invention can be used as
molecular weight markers, and as a food supplement. A polypeptide consisting
of SEQ ID NO:3, for example, has a molecular mass of approximately 47.52 kD
in its unglycosylated form. Protein food supplements are well known and the
formulation of suitable food supplements including polypeptides of the invention
is within the level of skill in the food preparation art.
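As a rough plausibility check on the stated molecular mass, the mass of an unglycosylated polypeptide can be approximated from its length. The sketch below assumes the common rule-of-thumb average of about 110 Da per residue, which is not taken from the specification, and uses the 428-residue length given for SEQ ID NO:3 in Example 2; the function name is hypothetical.

```python
# Assumption: rule-of-thumb average residue mass; not a value from the text.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Approximate mass in kDa of an unglycosylated polypeptide chain."""
    return n_residues * residue_mass_da / 1000.0


if __name__ == "__main__":
    # 428 residues is the length stated for SEQ ID NO:3 in Example 2.
    print(f"approximate mass: {approx_mass_kda(428):.2f} kDa")
```

The estimate of roughly 47 kDa agrees with the stated value to within about one percent; an exact figure would require summing the masses of the actual residues.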

The polypeptides of the invention are also useful for making
antibody substances that are specifically immunoreactive with CD39-like
proteins. Antibodies and portions thereof (e.g., Fab fragments) which bind to the
polypeptides of the invention can be used to identify the presence of such
polypeptides in a sample. For example, the level of the native protein
corresponding to SEQ ID NO:3 in a blood sample can be determined as an
indication of vascular condition. Such determinations are carried out using any
suitable immunoassay format, and any polypeptide of the invention that is
specifically bound by the antibody can be employed as a positive control.

The polypeptides of the invention are administered by any route
that delivers an effective dosage to the desired site of action. The determination
of a suitable route of administration and an effective dosage for a particular
indication is within the level of skill in the art. For treatment of vascular disease,
polypeptides according to the invention are generally administered
intravenously. In vivo murine studies with soluble human CD39 have shown that
mice injected intravenously with 50 µg recombinant soluble human CD39 in 100
µl sterile saline had biologically active CD39 in their sera for an extended period
of time, with an elimination half-life of almost 2 days (Gayle, R.B., et al., J.
Clinical Invest. 101:1851-1859 (1998)). Suitable dosage ranges for the
polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as
necessary by the clinician to provide maximal therapeutic benefit.

The present invention is illustrated in the following examples.
Upon consideration of the present disclosure, one of skill in the art will
appreciate that many other embodiments and variations may be made in the
scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following
examples.

EXAMPLE 1 

Isolation of SEQ ID NO:1 from a cDNA Library of Human Fetal 
Liver-Spleen 

A plurality of novel nucleic acids were obtained from a
b2HFLS20W cDNA library prepared from human fetal liver-spleen, as described
in Bonaldo et al., Genome Res. 6:791-806 (1996), using standard PCR, SBH
sequence signature analysis, and Sanger sequencing techniques. The inserts
of the library were amplified with PCR using primers specific for vector
sequences flanking the inserts. These samples were spotted onto nylon
membranes and interrogated with oligonucleotide probes to give sequence
signatures. The clones were clustered into groups of similar or identical
sequences, and single representative clones were selected from each group for
gel sequencing. The 5' sequence of the amplified inserts was then deduced
using the reverse M13 sequencing primer in a typical Sanger sequencing
protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single-pass gel sequencing was done using a 377
Applied Biosystems (ABI) sequencer. One of these inserts was identified as a
novel sequence not previously obtained from this library and not previously
reported in public databases. This sequence is shown in Figure 1 as SEQ ID
NO:1.

EXAMPLE 2 

Isolation of SEQ ID NO:2 and Determination of a Nucleotide Sequence
Encoding a 428-Amino Acid Protein with Sequence Homology to CD39

The nucleotide sequence shown in Figure 1, and labeled SEQ ID
NO:2, encodes the translated amino acid sequence SEQ ID NO:3, which is
shown in Figure 2. The extended nucleotide sequence was obtained by
isolating colonies generated from pools of clones from a human macrophage
cDNA library (Invitrogen, Cat. # A550-25). Briefly, the macrophage cDNA library
was plated on LB/Amp plates (containing 100 µg/ml ampicillin) at a density of
about 40,000 colonies/plate. The colonies were lifted onto nitrocellulose filters
and hybridized with a radiolabeled probe generated from the original clone (i.e.,
SEQ ID NO: 1).

That the identified clones corresponded to SEQ ID NOs:1 and 2
was confirmed by using gene-specific primers
(5'-GCTACCTCACTTCCTTTGAG-3' [SEQ ID NO: 9] and
5'-CTGGCTGGTGAAGTTTTCCTC-3' [SEQ ID NO: 10]) in a PCR-based assay.
Then PCR using vector- and gene-specific primers was employed to amplify the
5' portion of the cDNA. Nested primers were used to generate sequence from
the amplified product(s). Lasergene™ software was used to edit and "contig"
the partial sequences into a full-length sequence. As discussed above, the
amino acid sequence has striking homology to CD39, which is involved in
modulating platelet reactivity during vascular inflammation. Based in part on the
observed sequence similarity to CD39, the polypeptide encoded by SEQ ID NO:
2 was designated CD39-L66.
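Basic properties of the gene-specific primers (SEQ ID NOs: 9 and 10) can be summarized with a few lines of code. This is a sketch only: the Wallace rule used for the melting temperature (2°C per A/T and 4°C per G/C) is a rough rule of thumb for short oligonucleotides that is assumed here and is not part of the specification, and the function names are illustrative.

```python
# Primer sequences as given in the text.
PRIMERS = {
    "SEQ ID NO: 9": "GCTACCTCACTTCCTTTGAG",
    "SEQ ID NO: 10": "CTGGCTGGTGAAGTTTTCCTC",
}


def gc_fraction(seq):
    """Fraction of G and C bases in the oligonucleotide."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in degrees C (assumed, approximate)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
              f"Tm ~{wallace_tm(seq)} C")
```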

EXAMPLE 3 

Expression Study Using SEQ ID NO:2

The expression of SEQ ID NO:2 in various tissues was analyzed
using a semi-quantitative polymerase chain reaction-based technique. Human
cDNA libraries were used as sources of expressed genes from tissues of interest
(adult brain, adult heart, adult kidney, adult lymph node, adult liver, adult lung,
adult ovary, adult placenta, adult spleen, adult testis, bone marrow, fetal kidney,
fetal liver, fetal liver-spleen, fetal skin, fetal brain, fetal leukocyte and
macrophage). Gene-specific primers (5'-GCTACCTCACTTCCTTTGAG-3'
[SEQ ID NO: 9] and 5'-GCAGGTCTCCAAGGAAGTACG-3' [SEQ ID NO: 11])
were used to amplify portions of the SEQ ID NO:2 sequence from the samples.
Amplified products were separated on an agarose gel, transferred and
chemically linked to a nylon filter. The filter was then hybridized with a
radioactively labeled (α-32P-dCTP) double-stranded probe generated from the
full-length SEQ ID NO:2 sequence using a Klenow polymerase, random-prime
method. The filters were washed (high stringency) and used to expose a
phosphorimaging screen for several hours. Bands indicated the presence of
cDNA including SEQ ID NO:2 sequences in a specific library, and thus mRNA
expression in the corresponding cell type or tissue.

Of the 18 human tissues tested, macrophage was the only sample
that provided a signal, indicating that expression of SEQ ID NO:2 is tightly
regulated. In contrast, the CD39 molecule has been found in tissues such as
placenta, lung, skeletal muscle, kidney and heart.

EXAMPLE 4

Chromosomal Localization of the Gene Corresponding to SEQ ID NOs:1 and 2

Chromosome mapping technologies allow investigators to link
genes to specific regions of chromosomes. Assignment to chromosome 14 was
performed with the Coriell cell repository monochromosomal panel #2 (NIGMS
cell repository). This human-rodent somatic cell hybrid panel consists of DNA
isolated from 24 hybrid cell cultures retaining 1 human chromosome each. The
panel was screened with gene-specific primers
(5'-GCTACCTCACTTCCTTTGAG-3' [SEQ ID NO: 9] and
5'-CTGGCTGGTGAAGTTTTCCTC-3' [SEQ ID NO: 10]) that generated a
sequence tag site (STS). The Genebridge 4 radiation hybrid panel was also
screened (Research Genetics), and the results of the PCR screening were
submitted to the Whitehead/MIT Radiation Hybrid mapping email server at
http://www-genome.wi.mit.edu.

EXAMPLE 5 
Platelet Aggregation Assay 

Blood is anticoagulated with 0.1 volume 3.2% sodium citrate.
Platelet-rich plasma (PRP) is prepared with an initial whole blood centrifugation
(200 x g, 15 min., 25°C) and a second centrifugation of the PRP (90 x g, 10
min.) to eliminate residual erythrocytes and leukocytes. The stock suspension
of PRP is maintained at room temperature under 6% CO2-air. The platelet
aggregation assay uses a two-sample, four-channel Whole Blood
Lumi-Aggregometer, model 560 (Chronolog Corp., Havertown, PA). PRP
containing 1.22 x 10^ platelets is preincubated with the sample to be tested for
inhibition of aggregation for 10 min. at 37°C in a siliconized glass cuvette
containing a stirring bar, followed by stimulation with either ADP (5 µM),
collagen (5 µg/ml), or thrombin (0.1 unit/ml). Platelet aggregation is recorded
for at least 10 min. Data are expressed as the percentage of light transmission
with platelet-poor plasma equal to 100%.
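The normalization of aggregometer readings to a percentage scale can be sketched as follows. The text fixes platelet-poor plasma (PPP) at 100%; treating unstimulated PRP as the 0% baseline is an assumption made here for illustration, as are the function name and the example readings.

```python
def percent_transmission(reading, prp_baseline, ppp_reference):
    """Scale a raw light-transmission reading so that PPP corresponds to 100%.

    Assumption: the unstimulated PRP reading defines 0%.
    """
    span = ppp_reference - prp_baseline
    if span <= 0:
        raise ValueError("PPP reference must exceed the PRP baseline")
    return 100.0 * (reading - prp_baseline) / span


if __name__ == "__main__":
    # Hypothetical raw readings in arbitrary instrument units.
    prp, ppp = 0.2, 1.0
    for raw in (0.2, 0.6, 1.0):
        print(f"raw {raw:.1f} -> {percent_transmission(raw, prp, ppp):.0f}%")
```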






Example 6 
CD39-L4 is a soluble apyrase 
The mammalian ectoapyrase CD39 is an integral membrane
protein with two transmembrane domains (one at each end of the protein)
(Maliszewski, C. R. et al., J. Immunol. 153:3574-3583). The hydrophobicity
profiles for the deduced amino acid sequence of other family members, such as
CD39L1 and CD39L3, are very similar to CD39 (Chadwick B. P. and Frischauf
A-M.; Genomics 50:357-367), suggesting that these proteins also have two
membrane spanning domains. However, CD39-L4 does not appear to have a
second transmembrane domain at its C-terminus, suggesting that the N-terminus
hydrophobic region could code for a secretory signal. To test this hypothesis,
CD39-L4 was subcloned into the mammalian expression vector pcDNA3.1 and
a 6-Histidine tag was inserted into the coding sequence.

The CD39-L4 cDNA sequence was initially isolated from a
macrophage cDNA library (Invitrogen). The sense primer (5'-
TTAAAGCTTGGGAAAAGAATGGCCACTTC-3', SEQ ID NO. 20) with a HindIII
site and the antisense primer (5'-AGACTCGAGGTGGCTCAATGGGAGATGCC-
3', SEQ ID NO. 21) with a XhoI site were used to subclone the coding
sequences into the mammalian expression vector pcDNA3.1 (Invitrogen). The
nucleotide sequence of the insert is set forth in SEQ ID NO. 4. In order to
immunologically detect the protein, the coding region was further modified so
that it would include a Gly-Ser-6His epitope tag immediately following Arg24.
Briefly, two partially overlapping complementary oligonucleotides (5'-
GCGCTGTCTCCCACAGAGGATCGCATCACCATCACCATCACAACCAGCA
GACTTGGTT-3' (SEQ ID NO. 22) and 5'-
AACCAAGTCTGCTGGTTGTGATGGTGATGGTGATGCGATCCTCTGTGGG
AGACAGCGC-3' (SEQ ID NO. 23)) were used on the CD39-L4 pcDNA3.1
template. The primers were extended in opposite directions around the plasmid
using a 12-cycle PCR program (95°C, 1 minute; 55°C, 1 minute; 72°C, 15
minutes) (Stratagene). The reaction was treated with DpnI to digest the
methylated parental DNA and then transformed into E. coli. Colonies were
screened for the insert.
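The relationship between the two tagging oligonucleotides (SEQ ID NOs. 22 and 23) and the encoded epitope can be checked computationally. The sketch below confirms that the two oligonucleotides are reverse complements of each other and translates the sense oligonucleotide; the reading frame (starting at the third nucleotide) is an inference made here so that the Arg-Gly-Ser-6His stretch appears in frame, and the helper names are illustrative.

```python
# Oligonucleotides as given in the text (SEQ ID NO. 22 and SEQ ID NO. 23).
SEQ22 = ("GCGCTGTCTCCCACAGAGGATCGCATCACCATCACCATCACAACCAGCA"
         "GACTTGGTT")
SEQ23 = ("AACCAAGTCTGCTGGTTGTGATGGTGATGGTGATGCGATCCTCTGTGGG"
         "AGACAGCGC")

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, frame=0):
    """Translate complete codons of seq starting at offset `frame`."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


if __name__ == "__main__":
    print("reverse complements:", reverse_complement(SEQ22) == SEQ23)
    # Frame 2 is an assumption chosen to bring the tag into frame.
    print("encoded peptide:", translate(SEQ22, frame=2))
```

In this frame the sense oligonucleotide encodes an arginine followed by Gly-Ser and six histidines, consistent with the epitope described in the text.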

To ascertain whether CD39-L4-6His is secreted, the coding region
of the CD39-L4-6His protein was inserted into the pcDNA3.1 expression vector
and transiently transfected into COS-7 cells. COS-7 cells obtained from the
American Type Culture Collection were grown in DMEM supplemented
with 10% FBS and 100 units/ml penicillin G and 100 µg/ml streptomycin sulfate
at 37°C in 10% CO2. Transfections were performed at 75% confluency in 10 cm
plates with Fugene-6 according to the manufacturer's instructions. The cells in
7 ml of medium were incubated with 16 µl of Fugene-6 and 8 µg of DNA for
14-18 hours. At the end of the transfection the medium was replaced with DMEM
medium containing low serum (1% FBS). The cells were then incubated for
24-48 hours prior to harvesting.

The CD39-L4-6His was concentrated by treating the cell lysates
and medium with Nickel-NTA agarose (Qiagen) followed by SDS/PAGE and
immunoblot analysis with an antibody against the Arg-Gly-Ser-6His epitope.
Cells were washed twice with PBS containing 0.5 µg/ml leupeptin, 0.7 µg/ml
pepstatin and 0.2 µg/ml aprotinin. After a brief sonication and centrifugation
step to clear the lysate, the samples were then incubated with a Nickel-NTA
resin at 4°C for 2-3 hours. The histidine-tagged protein complexed to the resin
was washed three times with PBS before loading onto a 10% SDS/PAGE gel for
Western blot analysis. CD39-L4 was detected in both the cell lysate and the
medium from cells transfected with the CD39-L4-6His expression vector, but not
from control cells. While the predicted molecular weight of CD39-L4-6His is 46
kDa, the immunoreactive protein exhibited a mobility by SDS/PAGE
corresponding to a molecular mass of approximately 51 kDa in the media and
approximately 48 kDa in the cell lysate. The difference in apparent molecular
weight may be due to posttranslational modifications of three potential
N-glycosylation sites in the CD39-L4 predicted amino acid sequence.

Secretion of CD39-L4 was also examined by treatment of the
transfected cells with brefeldin A, an inhibitor of translocation of secretory
proteins from the endoplasmic reticulum to the Golgi apparatus. Chadwick, et
al., Genomics 50:357-367 (1998). Brefeldin A was dissolved in ethanol and
added to the transfected cells 48 hours after transfection. Both control and
brefeldin A treated cells were washed once with PBS and incubated for 8 hours
in medium with none or varying dosages of brefeldin A. Increasing dosages of
brefeldin A blocked secretion of CD39-L4-6His and led to massive intracellular
accumulation.


Example 7 
Site-directed mutagenesis of CD39L4 

Site directed mutagenesis was employed to increase the enzymatic
activity of CD39L4. Amino acid sequence comparisons between CD39 family
members reveal four highly homologous regions in all five human members
(Chadwick and Frischauf, Genomics 50:357-367, 1998). These regions, termed
apyrase-conserved regions (ACRs), are present not only in the CD39 family
members but other apyrases from species as distant as yeast and plants.
Examination of similarities and differences in the CD39 ACRs led to the design
of three CD39L4 mutants (see Figure 5). In these mutants, codons encoding
CD39 ACR specific residues were used to replace codons from the CD39L4 wild
type ACR sequence. Only residues with significantly different structural or
chemical properties were replaced. A PCR based approach was used to
produce these mutations.

Briefly, the expression vector pcDNA3.1 (Invitrogen) containing the
full coding sequence of the CD39L4 gene (with a 6 Histidine tag inserted after
Arg 24 in the coding sequence to allow purification of the secreted mature form
of the protein) was subjected to a PCR-based site-directed mutagenesis
approach using overlapping oligonucleotides [CD39-L4 ACR I mutant (nt
177-148 and 160-204): 5'-GTG AGT GCT CCC TGC ATC TAA CAT AAT TCC-3'
(SEQ ID NO: 12) and 5'-GAT GCA GGG AGC ACT CAC ACT AGT ATT CAT
GTT TAC ACC TTT GTG-3' (SEQ ID NO: 13); CD39-L4 ACR II mutant (nt
402-359 and 385-415): 5'-GCG TAG TCC TGC TGT TGC CCC TAG GTA CAC TGG
GGT CTT TTT CC-3' (SEQ ID NO: 14) and 5'-GCA ACA GCA GGA CTA CGC
TTA CTG CCA GAA C-3' (SEQ ID NO: 15); and CD39-L4 ACR III mutant (nt
532-485 and 513-540): 5'-CCC AAG CGA ATA TGC CTT CGT CTT GTC CAG
TCA TGA TGC TAA CAC TGC-3' (SEQ ID NO: 16) and 5'-CGA AGG CAT ATT
CGC TTG GGT TAC TGT G-3' (SEQ ID NO: 17)]. After amplification of the
whole plasmid with Pfu DNA polymerase (Stratagene) (95°C/1 min; 55°C/1 min;
72°C/15 min for 12 cycles), the methylated parental DNA was digested with the
restriction enzyme DpnI, leaving only the unmethylated PCR amplified products.
The resulting annealed double-stranded nicked products were then transformed
into bacteria and the resulting colonies were screened for the desired mutations
by sequencing. The subsequent constructs were fully sequenced to verify that
the mutations were in fact introduced and that no extraneous mutations were
generated.
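The overlap between the two oligonucleotides of each mutagenic pair can be checked against the nucleotide coordinates given in the text. The following sketch compares the reverse complement of the first (antisense) oligonucleotide with the start of the second (sense) oligonucleotide; the function and variable names are illustrative and not part of the specification.

```python
# Primer pairs as given in the text: (antisense, sense, expected overlap).
# Expected overlap comes from the stated coordinates, e.g. 177 - 160 + 1 = 18.
PRIMER_PAIRS = {
    "ACR I": ("GTG AGT GCT CCC TGC ATC TAA CAT AAT TCC",
              "GAT GCA GGG AGC ACT CAC ACT AGT ATT CAT GTT TAC ACC TTT GTG",
              177 - 160 + 1),
    "ACR II": ("GCG TAG TCC TGC TGT TGC CCC TAG GTA CAC TGG GGT CTT TTT CC",
               "GCA ACA GCA GGA CTA CGC TTA CTG CCA GAA C",
               402 - 385 + 1),
    "ACR III": ("CCC AAG CGA ATA TGC CTT CGT CTT GTC CAG TCA TGA TGC TAA CAC TGC",
                "CGA AGG CAT ATT CGC TTG GGT TAC TGT G",
                532 - 513 + 1),
}


def clean(seq):
    """Remove the triplet spacing used in the text."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return clean(seq).translate(str.maketrans("ACGT", "TGCA"))[::-1]


def primer_overlap(antisense, sense):
    """Longest suffix of revcomp(antisense) that is also a prefix of sense."""
    top = reverse_complement(antisense)
    sense = clean(sense)
    for k in range(min(len(top), len(sense)), 0, -1):
        if top.endswith(sense[:k]):
            return k
    return 0


if __name__ == "__main__":
    for name, (antisense, sense, expected) in PRIMER_PAIRS.items():
        found = primer_overlap(antisense, sense)
        print(f"{name}: overlap {found} nt (coordinates imply {expected} nt)")
```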

Example 8 

ACR III mutant increases ADPase activity 
Plasmids containing the mutated and wild type forms of the
CD39L4 gene were transfected into COS-7 cells. After two days, protein was
purified from the culture medium using a Nickel-NTA resin approach to
concentrate the tagged proteins. These proteins were then assayed for ATPase
and ADPase activity by measuring the inorganic phosphate released (Wang T-F
et al., J. Biol. Chem. 273:24814-24821, 1998). The proteins were incubated in
apyrase buffer (15 mM Tris pH 7.4, 135 mM NaCl, 2 mM EGTA and 10 mM
glucose) for 1 hour at 37°C with or without 2 mM CaCl2 or 2 mM MgCl2.
Phosphatase reactions were initiated by the addition of ADP or ATP to a final
concentration of 1 mM. The reaction of inorganic phosphorus with ammonium
molybdate in the presence of sulfuric acid produces an unreduced
phosphomolybdate complex. The absorbance of this complex at 340 nm is
directly proportional to the inorganic phosphorus concentration (Daly J. A., and
Ertingshausen G. Clin. Chem. 18:263 (1972) (Sigma Diagnostics)).
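Because the absorbance at 340 nm is directly proportional to the inorganic phosphorus concentration, a sample can be quantified against a single standard of known concentration. This is a minimal sketch under that proportionality assumption; the function name, the blank correction and all numeric readings are hypothetical and are not data from the specification.

```python
def phosphate_conc(sample_a340, blank_a340, standard_a340, standard_conc):
    """Phosphate concentration from blank-corrected A340 readings.

    Assumes absorbance is linear in concentration over the measured range,
    so one standard of known concentration fixes the proportionality.
    """
    standard_signal = standard_a340 - blank_a340
    if standard_signal <= 0:
        raise ValueError("standard must read above the blank")
    return standard_conc * (sample_a340 - blank_a340) / standard_signal


if __name__ == "__main__":
    # Hypothetical readings; the standard is taken as 1.0 mM phosphate.
    conc = phosphate_conc(sample_a340=0.30, blank_a340=0.10,
                          standard_a340=0.50, standard_conc=1.0)
    print(f"estimated phosphate: {conc:.2f} mM")
```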

As seen in Figure 7, mutations in ACR I and II eliminate activity,
whereas the mutations in ACR III increase activity six-fold over wild type. This
increased activity therefore offers a greater therapeutic potential, as less protein
could be administered to offer the same pharmacological effect. The
replacement of three amino acids in the ACR III region (amino acids 167 to 181 in
CD39-L4) and the resulting increase in ADPase activity predicts that
replacement of additional amino acids within this region by amino acids from the
equivalent region of CD39 may also enhance the activity of the protein over wild
type CD39L4. The increase in ADPase activity over wild type may also be due
to the replacement of only one or two of the three amino acids; this can be
confirmed by replacing one or two amino acids at a time.

The polynucleotide and amino acid sequences of a CD39-L4
variant termed ACRIII and having the amino acid substitutions D168-T, S170-Q
and L175-F compared to wild type CD39-L4 (SEQ ID NO: 5) are set forth in
SEQ ID NOs: 6 and 7, respectively, and in Figure 6.
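Substitutions written in the form D168-T (wild type residue, position, replacement) can be applied to a protein sequence programmatically. The sketch below uses a placeholder sequence, not SEQ ID NO: 5, with the relevant wild type residues placed at positions 168, 170 and 175 purely for illustration; the function name and the toy sequence are assumptions.

```python
import re


def apply_substitutions(seq, substitutions):
    """Apply substitutions such as 'D168-T' (1-based positions) to seq."""
    residues = list(seq)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)-?([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        wild_type, position, replacement = match.groups()
        index = int(position) - 1
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}, "
                             f"found {residues[index]}")
        residues[index] = replacement
    return "".join(residues)


# Placeholder sequence (NOT SEQ ID NO: 5): alanines with D, S and L placed
# at positions 168, 170 and 175 so the substitutions can be demonstrated.
TOY_WILD_TYPE = "A" * 167 + "D" + "A" + "S" + "AAAA" + "L" + "A" * 10

if __name__ == "__main__":
    variant = apply_substitutions(TOY_WILD_TYPE, ["D168-T", "S170-Q", "L175-F"])
    print(variant[164:178])
```

Checking the wild type residue before replacing it guards against off-by-one numbering errors, for example numbering that includes or excludes a leader peptide.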

Example 9

ACR III mutant and wild type forms are specific for ADP and not ATP

Both the CD39L4 wild type and the CD39L4 variant with mutations
in the ACRIII region hydrolyze ADP. However, when ATP was tested as a
substrate, neither the CD39L4 nor the CD39L4 variant catalyzed hydrolysis. In
contrast, CD39 as a membrane bound molecule (Marcus et al., The Journal of
Clinical Investigation, 99: 1351-1360) or as a genetically engineered soluble
form (Gayle et al., The Journal of Clinical Investigation, 101:1851-1858, 1998)
is able to hydrolyze both ATP and ADP substrates efficiently. The specificity
that both CD39L4 wild type and the CD39L4 variant with mutations in the ACRIII
region have for ADP is an advantageous feature that makes these CD39L4-type
molecules better antiplatelet therapeutic candidates than CD39, as ADP is the
agonist that causes platelet aggregation. Therapeutics that have both ADPase
and ATPase activities potentially could create adverse side effects by interfering
with levels of ATP in the circulation.


Example 10 
Organization of the human CD39-L4 gene 
A human CITB BAC genomic library (Research Genetics) was
screened with gene specific primers [246-16 (nt 5522-5543),
5'-CTTCCTTCACTGGGAATTCAGG-3' (SEQ ID NO: 18) and 246-K4 (nt 4922-
4945), 5'-CTGTTTACCGAGATGGTTGGAAGC-3' (SEQ ID NO: 19)] using a
PCR based assay.

Briefly, gene specific primers were used to screen pools of BAC
DNAs. BAC pools that produced an amplified DNA fragment of the predicted size
were pursued until an individual BAC was identified. BAC63-I18 was isolated
and sequenced with gene specific primers for the CD39-L4 cDNA, as well as
intron specific primers. The CD39-L4 coding sequence was found to be
distributed over 10 exons spanning 9.3 kb of genomic DNA as set out in SEQ ID
NO: 8.
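The predicted size of the screening product follows from the primer coordinates. The sketch below assumes that both coordinate ranges (nt 4922-4945 and nt 5522-5543) refer to the same reference sequence and that the two primers bind opposite strands and delimit the product; neither assumption is stated explicitly in the text, and the function name is illustrative.

```python
def amplicon_size(first_nt, last_nt):
    """Product length when primers delimit positions first_nt..last_nt (inclusive)."""
    if last_nt < first_nt:
        raise ValueError("last_nt must not precede first_nt")
    return last_nt - first_nt + 1


if __name__ == "__main__":
    # Outermost coordinates of primers 246-K4 (4922-4945) and 246-16 (5522-5543).
    print("predicted product:", amplicon_size(4922, 5543), "bp")
```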


Example 11 
CD39-L4 is stimulated by divalent cations 

The high degree of conservation in the apyrase conserved
regions of CD39-L4 suggests similar function to other apyrases. To test this
hypothesis, COS-7 cells were transfected with the CD39-L4-6His construct as
described above. The medium from transfected cells was incubated with
Nickel-NTA resin (Qiagen) in order to capture the 6His tagged protein. The
resin was washed with assay buffer (buffer A: 15 mM Tris pH 7.5, 134 mM
NaCl and 5 mM glucose), and the protein, still tethered to the resin in a
suspension, was assayed for ADPase activity. Nucleotidase activity was
determined by measuring the amount of inorganic phosphate released from
nucleotide substrates using the technique of Daly and Ertingshausen, Clin.
Chem. 18:263-265 (1972). In this reaction the complex of inorganic
phosphorus with phosphor reagent (ammonium molybdate in the presence of
sulfuric acid) produces an unreduced phosphomolybdate compound. The
absorbance of this complex at 340 nm is directly proportional to the inorganic
phosphorus concentration. The protein still tethered to the resin as a 30%
suspension in buffer A was assayed by the addition of the nucleotide to a
final concentration of 1 mM and incubated at 37°C for 30 minutes. The
reaction was stopped by adding 100 volumes of phosphor reagent. The
amount of phosphate released from the reaction was quantified using a
calcium/phosphorus combined standard (Sigma). The amount of CD39-L4
protein used in the assays was estimated by comparing the intensity of the
CD39-L4 band in Western blots with that of a series of standards of known
quantity. CD39-L4 protein from transfected cells displayed a 2.3 fold
increase in activity over the cells transfected with the vector alone. When
Ca2+ and Mg2+ were added, the activity increased 3.6 fold and 6 fold,
respectively.

Example 12 
Characterization of CD39-L4 activity 
CD39-L4 protein was assayed for ADPase activity in the
presence of different kinds of inhibitors of ADPases. Control ecto-apyrase
activity was determined with protein tethered to the Nickel-NTA resin. Both
assays were performed as described above except the protein was in buffer A
containing 2 mM CaCl2 and 2 mM MgCl2. As shown by Table 1 below,
inhibitors of phosphatases (F-) and adenylate kinase (Ap5A) did not inhibit
activity. The inhibitors of vacuolar ATPases (NEM), mitochondrial ATPases
(N3-) and Na+,K+-ATPase (ouabain) did not significantly inhibit the Ca2+ and
Mg2+ stimulated activity. However, metal chelators (EDTA and EGTA)
significantly inhibited activity. These results show that the overwhelming
majority of the activity in the assays originates from a protein bound to the
resin with characteristics of an E-type apyrase.

Table 1

Inhibition of CD39-L4 activity

INHIBITORS          % OF CONTROL
Control             100 ± 7
Ouabain (1 mM)      96 ± 6
NEM (10 mM)         106 ± 5
N3- (1 mM)          100 ± 12
F- (10 mM)          113 ± 5
Ap5A (10 µM)        121 ± 9
EGTA (2 mM)         35 ± 3
EDTA (2 mM)         52 ± 3



As shown in Table 2 below, the nucleotide specificity of
CD39-L4 was also assayed as described above. The CD39-L4 activity was
determined with protein tethered to the Ni-NTA resin. The protein was in
buffer A containing 1 mM EGTA, as well as 2 mM CaCl2 and MgCl2. The
assay was started by adding the nucleotides to a final concentration of 1 mM.
The values below are expressed relative to ADP. The relative activity of the
nucleotide triphosphates varies almost seven-fold with ATP being the poorest
substrate. No phosphate release was detected with AMP and ADP was
hydrolyzed at a rate approximately twenty-fold higher than ATP. The other
nucleotide diphosphates (GDP and UDP) were also very efficiently
hydrolyzed by CD39-L4. These results indicate that CD39-L4 defines a new
class of E-type apyrase in humans with a specificity for NDPs as enzymatic
substrates.

Table 2

Substrate specificity of CD39-L4

NUCLEOTIDE          % OF CONTROL
ADP                 100 ± 15
ATP                 5 ± 1
AMP                 0 ± 1
CTP                 26 ± 2
GTP                 34 ± 11
UTP                 12 ± 4
CDP                 268 ± 11
GDP                 334 ± 38
UDP                 408 ± 14
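The fold differences quoted in the text can be recomputed from the mean values in Table 2. The sketch below uses only the tabulated means (standard deviations are ignored, and AMP is omitted because its value is effectively zero); the dictionary and function names are illustrative.

```python
# Mean relative activities from Table 2 (percent of the ADP value).
RELATIVE_ACTIVITY = {
    "ADP": 100, "ATP": 5, "CTP": 26, "GTP": 34, "UTP": 12,
    "CDP": 268, "GDP": 334, "UDP": 408,
}


def fold(numerator, denominator):
    """Ratio of the mean relative activities of two substrates."""
    return RELATIVE_ACTIVITY[numerator] / RELATIVE_ACTIVITY[denominator]


def triphosphate_range():
    """Ratio of the best to the poorest nucleotide triphosphate substrate."""
    ntps = [RELATIVE_ACTIVITY[n] for n in ("ATP", "CTP", "GTP", "UTP")]
    return max(ntps) / min(ntps)


if __name__ == "__main__":
    print(f"ADP over ATP: {fold('ADP', 'ATP'):.1f}-fold")
    print(f"range among triphosphates: {triphosphate_range():.1f}-fold")
    print(f"UDP over ADP: {fold('UDP', 'ADP'):.2f}-fold")
```

The computed ratios, 20-fold for ADP over ATP and 6.8-fold across the triphosphates, match the "approximately twenty-fold" and "almost seven-fold" figures given in the text.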



Example 13 

Glycosylation is not essential for CD39-L4 activity

Posttranslational modifications such as N-linked glycosylation
are common in secreted and membrane-bound mammalian proteins. These
modifications may be important for correct protein folding or enzymatic
activity and are not easily reproduced when the proteins are expressed in
other organisms such as bacteria. In order to test whether CD39-L4 is
glycosylated, COS-7 cells, transfected as described in Example 6, were
treated with tunicamycin (Sigma), which blocks the formation of N-glycosidic
linkages.

COS-7 cells were grown to 75% confluency and transfected with
the CD39-L4-6His construct. After 24 hours, a fraction of the COS-7 cells
were treated with tunicamycin at a concentration of 5 µg/ml. The media was
replaced again after 24 hours with fresh tunicamycin and harvested after 48
hours. The CD39-L4-6His protein was concentrated by treating the media
with Nickel-NTA agarose (Qiagen). The resin was washed with assay buffer
and the protein still tethered to the resin in a suspension was assayed for a
shift in electrophoretic mobility as well as its ADPase activity.

Western blot analysis using an antibody against the 6-His
epitope revealed that the glycosylated CD39-L4 protein isolated from the
control cells had an approximate size of 51 kDa. However, the protein from
tunicamycin-treated cells had a molecular weight of approximately 46 kDa,
indicating that the protein was deglycosylated.

ADPase activity of the protein from tunicamycin-treated cells was assayed as
described in Example 11 above. The deglycosylated CD39-L4 protein had
ADPase activity comparable to an equal amount of the glycosylated protein
isolated from control cells. This demonstrates that glycosylation of the
protein is not important for ADPase activity.

The present invention is not to be limited in scope by the 
exemplified embodiments which are intended as illustrations of single aspects 
of the invention, and compositions and methods which are functionally 
equivalent are within the scope of the invention. Indeed, numerous 
modifications and variations in the practice of the invention are expected to 
occur to those skilled in the art upon consideration of the present preferred 
embodiments. Consequently, the only limitations which should be placed 
upon the scope of the invention are those which appear in the appended 
claims. 

All references cited within the body of the instant specification
are hereby incorporated by reference in their entirety.